
POPULATION STATISTICS 

AND 

THEIR COMPILATION 


Revised Edition 


BY

HUGH H. WOLFENDEN

Fellow of the Institute of Actuaries, Great Britain
Fellow of the Society of Actuaries
Fellow of the Royal Statistical Society


WITH AN APPENDIX ON

Some Theory in the Sampling
of Human Populations

BY

W. EDWARDS DEMING, Ph.D.



PUBLISHED FOR THE SOCIETY OF ACTUARIES

BY

THE UNIVERSITY OF CHICAGO PRESS

1954



The University of Chicago Press, Chicago 37
Cambridge University Press, London, N.W. 1, England
The University of Toronto Press, Toronto 5, Canada

Copyright 1925 by the Actuarial Society of America;
Society of Actuaries (successor to the
Actuarial Society of America). All rights reserved.
Revised Edition published 1954. Composed and printed
by The University of Chicago Press, Chicago, Illinois,
U.S.A.



FOREWORD


The first edition of this book, by the same author, was published by the
Actuarial Society of America in 1925. In preparing this second edition the
author has changed the text materially by bringing historical and other
data up to date, by thorough revision throughout, and by many additions
to the text which embody a number of developments, especially recent
ones. The value of the work has thereby been increased substantially.

The Society of Actuaries, as successor to the Actuarial Society of America,
appreciates the opportunity to publish this book. Reflecting the author's
long and intensive study in the field of population statistics, this
book has the added value of his wide practical experience in that field.
This volume should be of great assistance to students who wish to acquire
a mastery of both the elementary and advanced aspects of this subject.
We believe that it will be consulted by demographers outside the actuarial
profession as well as by actuaries.

The Society gratefully acknowledges the author's devotion to his profession
in contributing so much of his time and effort to this publication.

John R. Larus, President
Society of Actuaries




PREFACE


A considerable number of major developments have occurred in the
theoretical and practical problems involved in the compilation of population
statistics since this book was first published under its present title,
as "Actuarial Study No. 3," by the Actuarial Society of America in 1925.
This new edition accordingly has been re-written and enlarged throughout,
in order to incorporate the necessary descriptions and references
covering that new material.

In Section II the history of census-taking has been dealt with more
fully in the light of various papers which have appeared since, and
the details concerning modern censuses have been brought up to date.
Similarly the discussion of registrations of births, deaths, and marriages
in Section III includes recent contributions. Section IV, on the reliability
of census and registration statistics and the nature of the errors therein,
also now gives data which have become available for various countries
since publication of the first edition, and the extent of under-enumeration
and under-registration is considered. In Section V emphasis is given to
the importance of preliminary adjustments for errors of age under some
circumstances, and the treatment of population estimates includes an
examination of the elusive problem of projections.

Section VI, dealing with the mathematical relationships between
births, deaths, and populations, and the resulting formulae for the rates
of mortality, now includes formula (18a) devised by Moriyama and Greville,
and has been enlarged with additional explanations and proofs. This
latter material has been brought in to emphasize the importance of the
device by which the assumptions of uniform distributions can be stated in
terms of probabilities, and to direct attention more effectively to the very
easy and rapid manner in which the mortality formulae based on the
uniformity assumptions can be reached by the use of a simple Lemma, both
of which methods were first stated in my paper in T.A.S.A., XXIV, 126,
from which the explanations have been taken.

Some further history has been included in Section VII on the construction
of mortality tables, and the latest methods used in Great Britain and
the United States have been added to the descriptions previously given.
In this chapter the construction of population tables is discussed,
necessarily, in the light of the fact that it also involves inherently the problem
of "graduation," the two processes of construction and graduation being
practically inseparable in such work, in contrast with their almost distinct
nature in the actuarial problem of preparing tables from the records of
insured lives. The errors in population data are so varied, and their correction
demands at many stages the use of methods of such special types, that it
is not possible for a student to comprehend the questions which are posed
in these procedures unless the graduation principles are studied carefully
and are well understood. The graphic method, curve fitting, tangential
and osculatory interpolation, and linear compounding coefficients which
minimize the mean square error in a specified order of differences are
therefore all described with details which it is hoped will be sufficient to
enable the student to grasp the underlying principles and so equip himself
with the means of choosing appropriate methods under any practical
conditions he may meet.

Formulae recently suggested have been incorporated in Section VIII
on the construction of abridged life tables, and slight amplifications
have been made in Section IX on the methods of comparing the mortalities
of different communities. A new chapter on the forecasting of mortality
rates has been included as Section X. The discussions of mortality by
cause of death in Section XI, occupational mortality in Section XII, and
the use of census and registration data in the compilation of statistics
relating to marriages, births, orphanhood, unemployment, etc., in Section
XIII, have been enlarged.

In this edition a new Section XIV dealing with the modern theory of
reproductivity has been inserted, with the object of explaining the
principles and limitations of gross and net reproduction rates and the inherent
rate of natural increase. The final Section XV on sickness data now
includes references to recent standard diagnosis and classification codes.

In the belief that students are assisted materially when methods are
identified by the names of their originators, the Table of Contents has
been compiled in more detailed form than in the first edition (thus in
effect eliminating the necessity for an index also) with the appropriate
names attached to original and important contributions.

Throughout these enlargements of the original edition, special attention
has been paid to the needs of actuarial and other students as those
needs have been revealed during a long experience of lecturing on this
subject to students at the Actuarial Society and elsewhere. If a student is
to be considered well informed, it is not sufficient for him to assume, for
example, that some particular formula, however great its supposed
authority may be, can be applied in any practical circumstances without a
careful examination of rival methods. Furthermore, the underlying
principles and genesis of each process must be fully understood. Without
such knowledge, a student cannot assess the conditions proper for each
method's application, and (even more importantly) he will not be aware
of the circumstances in which it should not be used. The rapid and
undiscriminating presentation of so-called routine methods thus can never
constitute good or adequate education; any attempt to "teach" a subject
such as this in capsule form, or in a prescribed and curtailed number of
pages or hours, can produce only insufficient preparation full of pitfalls
waiting to engulf the uninformed. Hard work and concentrated thinking,
both of which take time and effort, will always be essential in the
handling of technical procedures which naturally demand knowledge,
judgment, and experience in their practical applications.

In conformity with this objective of maintaining thoroughness in our
educational programs, references are given in the appropriate places to
my companion volume on "The Fundamental Principles of Mathematical
Statistics (with Special Reference to the Requirements of Actuaries and
Vital Statisticians, and An Outline of a Course in Graduation)," which
was published by the Macmillan Company of Canada for the Actuarial
Society of America, New York (now the Society of Actuaries, Chicago), in
1942. In that volume readers will find explanations of many portions of
the underlying mathematical theories which are too extensive for
restatement here. As was remarked in the Preface to that book, history is not
ignored for it is, in my belief, "essential to the proper understanding of
any subject to absorb the history of the mental processes which have
guided its development." Detailed references also are given in both these
volumes, so that students and research workers may be able, without
frustrations, to find original and more elaborate sources on difficult or
challenging points which inevitably arise in the minds of all enquiring
readers in subjects of this kind.

The entirely new Appendix on Sampling has been written on my invitation
by Dr. W. Edwards Deming. His acknowledged pre-eminence in
that field made it especially appropriate that he should be the one to
prepare a condensed statement of the theory for inclusion in this volume.
My thanks in full measure are due to him for his cordial assistance.

Dr. Thomas N. E. Greville, F.S.A., has collaborated most generously
and effectively in the preparation of material for this book dealing with
those aspects of population statistics on which he has written so many
original and valuable papers. Mr. Robert J. Myers, F.S.A., similarly
has made notable contributions on several matters, and, like Dr. Greville,
has made a critical reading of almost the whole of the completed
manuscript. Dr. Deming's Appendix, to which earlier reference has been made,
was prepared in the same co-operative spirit. And the final arrangements
for publication have been assisted most significantly by Mr. Wilmer A.
Jenkins, F.S.A. (as Chairman of a Special Committee of the Society of
Actuaries, which included also Messrs. E. W. Marshall, A. Pedoe, R. G.
Stagg, and J. S. Thompson). To all of these collaborators I wish to record
my appreciation for their interest and invaluable help.

Hugh H. Wolfenden



TABLE OF CONTENTS

SECTION                                                        PARAGRAPH

I. INTRODUCTORY
  Scope of Study: the Science of ... . . . 1
  Value of Study of Vital Statistics
  Uses of Census and Registration Systems . . . 2
  The Importance of Social Statistics . . . 3
  Types of Information Obtained at the Census . . . 4
  The Nature of Enumeration and Registration . . . 5

II. THE CENSUS
  History of Census-taking . . . 6
  Dates of Various National Population Censuses . . . 7
  General Principles of Census-taking
  Basic Principles . . . 8
  "De Facto" and "De Jure" Methods . . . 9
  "Householder" and "Canvasser" Methods . . . 10
  Date of Census, and Intercensal Period . . . 11
  The United States Population Census . . . 12
  The Censuses of Canada . . . 13
  The Censuses of the United Kingdom and Eire . . . 14
  Censuses of Other Countries . . . 15

III. THE REGISTRATION OF BIRTHS, DEATHS, AND MARRIAGES
  General . . . 16
  History of Registration Systems . . . 17
  Registration in the United States:
    Development of the System . . . 18
    The Registration Area for Deaths . . . 19
    The Registration Area for Births . . . 20
    Marriage and Divorce Statistics . . . 21
    The Present Uniform Statistical Procedures and Certificates . . . 22
  Registration in Canada . . . 23
  Registration in the United Kingdom and Eire . . . 24
  Registration in Other Countries . . . 25
  The Fundamental Requirements for Satisfactory Birth and Death Registration . . . 26

IV. THE RELIABILITY OF CENSUS AND REGISTRATION STATISTICS, AND THE NATURE OF THE ERRORS THEREIN
  Errors in Census Statistics
  Sources of and Reasons for Errors . . . 27
  Types of Errors: . . . 28







    (1) The Deficiency in the Number of Infants:
      British Data . . . 29
      U.S. Data . . . 30
      The Tendency To Omit Young Children . . . 31
      Misstatements of Age of Children Actually Enumerated; Dunlop's 1911 and 1921 Census Samples; New Zealand 1921 Census Sample . . . 32
    (2) Heaping at Certain Digits of Age:
      Age and Sex Distribution Diagrams . . . 33
      The Order of Digit Selection: U.S., British, and Indian Statistics; Methods of Analyzing Concentration on Particular Digits Used by Ackland, King, Meikle, Vaidyanathan, Wolfenden, and in the U.S. and British Official Reports . . . 34
      General Measures of Digit Concentration: Young's Coefficient of Error and Test of Minus Differences; Myers' Blended Method and Index of Preference; Greville's Analyses . . . 35
      Differences between Ungraduated and Graduated Populations as a Test of Digit Errors: Dunlop's Scottish Data; Knibbs' Australian Comparisons; King's Ratio Method . . . 36
      Australian and New Zealand Samples of Actual Misstatements . . . 37
    (3) Overstatement of Age until Majority, then Understatement, and Overstatement at Advanced Ages:
      The Nature of and Reasons for Such Errors . . . 38
      Measurement of the Errors: King's and Derrick's British Analyses . . . 39
      The Effects of Old-Age Pension Legislation . . . 40
      Statistics of Centenarians . . . 41
      Effects of the Form of the Age Query, and Illiteracy . . . 42
    (4) Persons of Unknown Age: Extent of and Reasons for "Unknown" Returns; Deming's Method of Elimination . . . 43
    (5) Under-enumeration: U.S. Data . . . 44
  Errors in Registration Statistics
  Reasons for Under-registrations . . . 45
    (1) Under-registration of Births; Sample Investigations at the 1931 and 1941 Canadian Censuses; 1940 U.S. Census Data and Analyses; Method Suggested by Chandrasekar and Deming . . . 46
    (2) Under-registration of Deaths: Investigations Made in Constructing the U.S. 1939-41 Life Tables . . . 47
    (3) Stillbirths: Their Definition and Registration . . . 48
    (4) Age at Death, Cause of Death, and Occupation . . . 49
    (5) Deaths and Births of Non-residents, and Children Born of Non-resident Parents . . . 50












V. PRELIMINARY ADJUSTMENTS FOR ERRORS OF AGE IN CENSUS AND REGISTRATION STATISTICS; AND ESTIMATES OF POPULATIONS
  (1) Preliminary Adjustments for Errors of Age
    Adjustment for Unknown Ages . . . 51
    Preliminary Redistributions for Digit Concentration: Methods in the Austrian, U.S., and Indian Life Tables . . . 52
    Grouping Principles and Methods . . . 53
    The Analysis of Various Grouping Methods Illustrated by Young, Glover, Papps, King, Myers, Greville, Meikle, Vaidyanathan, and in Various Reports . . . 54
  (2) Estimates of Populations
    Intercensal and Postcensal Populations . . . 55
    Mean Populations . . . 56
    The Statistical Method of Estimation . . . 57
    Special Methods for Estimating Local Populations: Early Methods, Snow's Use of Multiple Correlation, and Procedures Developed in the United States . . . 58
    Arithmetical Progression . . . 59
    Geometrical Progression, and Its Limitations . . . 60
    Modified Geometrical Progression To Adjust Group Values: . . . 61
      G.P. Multiplied by a Ratio . . . 61 (i)
      A. C. Waters' First Method: Formulae and Defect . . . 61 (ii)
      A. C. Waters' Second Method: Formulae and Defect . . . 61 (iii)
      Traversi's Method . . . 61 (iv)
    The Combined Progression Method . . . 62
  (3) The Prediction of Future Populations
    The Danger of Population Prediction . . . 63
    Finite Difference Methods and Parabolic Curves . . . 63 (i)
    Methods Proposed by Stevenson, Knibbs, and Others . . . 63 (ii)
    The Logistic Curve (of Verhulst, and Pearl and Reed): Mathematical Form, Methods of Fitting, and Limitations . . . 63 (iii)
    The Use of Survival Ratios from Life Tables: Methods Used in the U.S. and Great Britain, and the Uncertainties of Long Range Population Estimates . . . 63 (iv)

VI. THE MATHEMATICAL RELATIONSHIPS BETWEEN BIRTHS, DEATHS, AND POPULATIONS, AND THE FORMULAE FOR THE RATES OF MORTALITY
  Fundamental Relationships Involving Calendar Years and Years of Age: Notation and Formulae . . . 64
  Calculation To Divide the Calendar Year's Deaths According to the Two Generations from Which They Arise: Methods in U.S. Life Tables by Glover and Greville, and Recent U.S. Sample Values . . . 65
  Calculation of ... To Divide the Deaths of the Year of Age . . . 66










  Formulae for the Rates of Mortality
  ... on Movement of the Population Over the Year of Age or the Calendar Year . . . 67
  The Type I Formula Over the Year of Age . . . 68
  The Type III Formula Over the Calendar Year . . . 69
  Formulae in Terms of E and D, Based on Relating the Calendar Year's Deaths to the Births from Which They Arise . . . 70
  Formulae in Terms of E and D, Based on the Calendar Year's Births and the Deaths to Which They Give Rise . . . 71
  Formulae in Terms of P and D . . . 72
  Formula in Terms of E, P, and D . . . 73
  Formulae in Terms of E, 5, and D . . . 74

  Formulae for the Rates of Mortality on the Assumption of Uniform Distributions
  Wolfenden's Method of Stating the Assumption of a Uniform Distribution of Deaths Over the Year of Age in a Population of Type I; and the Resulting Formula for ... in Terms of P and D, Together with His Short Proof Based on a Lemma . . . 75
  Wolfenden's Similar Method of Stating Uniform Movement Over the Year of Age in the Type III Population; and the Resulting Formula in Terms of P and D, and Its Short Proof Based on the Lemma . . . 76
  Wolfenden's Similar Method of Stating Uniform Movement Over the Calendar Year in the Type III Population; and the Resulting Rapid Proof of the Formula in Terms of E, P, and D, and Its Short Proof Based on the Lemma . . . 77

  The Practical Application of the Preceding Formulae
  The Necessity for Distinctive Methods at Infantile and Higher Ages . . . 78
  Applicability of the Various Formulae without Assuming Uniform Distributions; Their Use by the Registrars-General of England and Wales, in the U.S. Abridged Life Tables 1919-20, by Moriyama and Greville, and by the U.S. National Office of Vital Statistics . . . 79
  Applicability of the Various Formulae Assuming Uniform Distributions . . . 80

VII. THE CONSTRUCTION OF MORTALITY TABLES FROM POPULATION STATISTICS
  Preliminary . . . 81
  (1) Construction of Mortality Tables from Death Returns Only
    Basic Principle; Halley's Breslau and Price's Northampton Tables; Roman Estimates of Macer and Ulpian; Graunt's Work on the London Bills of Mortality; Barton's and Wigglesworth's Tables in U.S. . . . 82
  (2) Construction of Mortality Tables from Census Returns Only
    Basic Principle; Difficulties of Necessary Adjustments . . . 83
    Meech's Tables in U.S.; Hardy's, Ackland's, Meikle's, and Vaidyanathan's Indian Tables . . . 84












  (3) Construction of Mortality Tables from Death and Census Returns, Supplemented Frequently by Birth Returns at the Infantile Ages
    Principles; Use of One or Two Censuses; Different Problems at Infantile, Adult, Oldest, and Juvenile Ages . . . 85
    (A) Infantile Ages
      Difficulties . . . 86
      English Life Table Methods from Birth and Death Returns Only; Diagrammatic Presentation: . . . 87
        Farr's Method . . . 88
        Methods of English Life Tables Nos. 5 to 8 . . . 89
        Pell's Method . . . 90
        Methods of English Life Tables Nos. 9 and 10; Quarterly and Monthly Rates; DePorte's Abridged Method for Monthly Rates (in U.S.) . . . 91
      Methods in U.S. and Canada, from Death Returns and Corrected Populations or Births:
        Glover's Method (following Czuber) in U.S. Life Tables 1890-1910 . . . 92 (i)
        Henderson's Method . . . 92 (ii)
        Miss Foudray's Method in U.S. Abridged Life Tables 1919-20 . . . 92 (iii)
        Greville's Method in U.S. Life Tables 1939-41 . . . 92 (iv)
      Graduations by Modified Makeham Formulae . . . 93 (i)
      Formulae for ... . . . 93 (ii)
      The Determination of μ at 0, 1, and 2; Greville's Estimation of μ0 in U.S. Life Tables 1939-41 . . . 93 (iii)
      Adjustments for Migration in U.S. Life Tables 1939-41 . . . 94
    (B) Adult Ages
      Basic Principles; Appraisal of the Present Comparative Utilities of Different Methods . . . 95
      (i) Graphic Method Used by Milne in Carlisle Table and by Others Later . . . 96
      (ii) Farr's First Method (in English Life Tables Nos. 1 and 2), and Second Method (in English Life Table No. 3 and Healthy English Table No. 1); Comments of Greenwood and Derrick . . . 97
      (iii) Later English Life Methods; Tatham's Principle; Curve of Sines . . . 98
      (iv) Tangential and Osculatory Reproducing Interpolation Formulae on Sprague's Principles:
        General Principles and Instances of Application by King and Others . . . 99
        Sprague's Fifth Difference Osculatory Formula; Sprague's and Lidstone's Proofs; Alternative Forms; Reilly's Generalizations; Buchanan's Formula . . . 100





        The Karup-King Third Difference Tangential Formula; Karup's, King's, and Lidstone's Proofs; Alternative Forms . . . 101
        Shovelton's Six-Point Tangential Formula . . . 102
        Henderson's Six-Point Tangential Formula . . . 103
      (v) Tangential and Osculatory Reproducing Formulae on Henderson's Principle:
        Statement of the Principle; Henderson's Approximately Osculatory Formula . . . 104
        Henderson's Accurate Difference-Equation Solution . . . 105
        Jenkins' Fifth Difference Osculatory Reproducing Formula . . . 106
      (vi) Tangential and Osculatory Non-reproducing Formulae on Jenkins' Principles:
        Explanation of the Principles; Jenkins' Fifth Difference Osculatory Formula; Lindsay's Alternative Demonstration . . . 107
      (vii) The General Problem of Determining Tangential and Osculatory Formulae of the Reproducing and Non-reproducing Types:
        Contributions to the General Theory by Reid and Dow, Kerrich, Greville, and Schoenberg . . . 108
        Reid and Dow's General Fifth Difference Osculatory Non-reproducing Formula . . . 109
        Kerrich's More General Approach; the Pseudo-Analytical Method of Graduation . . . 110
        Greville's Most General Expression for Any Order of Contact, and Particular Formulae Satisfying Predetermined Requirements; Examples of His New Formulae . . . 111
      (viii) Methods of Calculation, and Treatment of the End Terms, with Tangential and Osculatory Formulae; and the Unequal Interval Case:
        The Everett and Linear Compound Methods of Calculation . . . 112
        End Values Handled by: Unsymmetrical Formulae; Assuming Missing Differences Constant and Completing Difference Table; Hypothetical Extensions; Continuing Interpolation with Constant Differences; Special Interpolation Curves . . . 113
        Ackland's and Reilly's Investigations of the Unequal Interval Case for Reproducing Formulae on Sprague's Principles . . . 114
      (ix) Reproducing and Non-reproducing Interpolation Minimizing the Mean Square Error in a Specified Order of Differences:
        Basic Principle . . . 115








        De Forest's Assumptions in Minimizing the Mean Square Error of a Linear Compound; His Minimizing of the Mean Square Error of ...; Wolfenden's and Greville's Expositions of the Theory . . . 116
        Greville's Linear Compounding Coefficients for Reproducing Interpolations on De Forest's Assumptions . . . 116 (i)
        Greville's Linear Compounding Coefficients for Non-reproducing Interpolations on De Forest's Assumptions . . . 116 (ii)
        Beers' Assumptions in Minimizing the Mean Square Error of ... . . . 117
        The Resulting Linear Compounding Coefficients for Reproducing Interpolations . . . 117 (i)
        Beers' Linear Compounding Coefficients for Non-reproducing Interpolations . . . 117 (ii)
        Necessary Precautions . . . 117 A
      (x) White's "Interlocking" Interpolation Formulae . . . 118
      (xi) Reproducing Subtabulation Minimizing the Sum of the Squares of All Differences of Given Order by Difference Equation Operations; Spoerl's, Vaughan's, and Greville's Investigations; Practical Limitations . . . 119
      (xii) Pivotal Values:
        King's Basic Principle; His Formula Using Ordinary Interpolation from Three Groups; King's Proof; Applications in English Life Tables Nos. 7 to 10 and Elsewhere . . . 120
        De Forest's General Method of Deriving Such Formulae; Proofs by His Method of King's Formula, Henderson's Formula Based on ... Groups, and King's Later Formula with ... Groups . . . 121
        Buchanan's Suggestion of Using Osculatory Interpolation for Pivotal Values; Resulting Formulae Based on Sprague's Reproducing 5th Difference, Jenkins' Non-reproducing 4th Difference, and Jenkins' Non-reproducing 5th Difference Formulae; Comparative Results . . . 122
        Applications of the Theory of Linear Compounding and Reduction of Mean Square Error; the J.I.A. Best Fitting 3d Degree Least Squares Formula, and Its Use in U.S. Life Tables 1939-41; Necessity for Compromise between Best Fit for Pivotal Values and Smoothness of Subsequent Interpolations Suggests Graphic Determination of Pivotal Values (Illustrated by Jenkins), and Reid and Dow's Proposal . . . 123

        Non-central Pivotal Formulae; Uses in New Zealand Life Tables and U.S. Abridged Life Tables 1919-20; Other Methods of Finding End Values; Close Results of Determining Values from P and D Separately or Directly from q . . . 124
      (xiii) Curve-fitting Methods:
        (a) Applications of Makeham's Formula by Reitz, Forsyth, and Greville; Determination of Makeham Constants To Reproduce Annuity Values Closely; (b) Hardy's Use of Makeham's Second Formula; (c) Trachtenberg's Modifications of Gompertz's Formula; (d) Suggestions of Hardy, Buchanan, Lidstone, Perks, Wittstein, Steffensen, and Others; (e) Applications of Pearson's Frequency Curves in Indian and Egyptian Tables; (f) Hardy's, Brownlee's, and Steffensen's Curves for ... or ...; (g) Graduations by Reference to Standard Tables; Used for Indian Data . . . 125
    (C) Oldest, and "Juvenile" between Infantile and Adult, Ages
      Reasons for Special Methods; Typical Curves of ... and ..., etc. . . . 126
      Difficulties at the Oldest Ages; King's Interpolations in English Life Tables; Barford's Use of Newton-Sheppard Adjusted Differences in Australian Tables; Gompertz Graduations in King's English Life Table No. 9 and in No. 10 and Scottish Table; Makeham Graduation in 1920 Northern Ireland Tables; Wittstein's Formula in U.S. Life Tables 1890, etc.; Henderson's Comparisons . . . 127
      Rapid Changes at Juvenile Ages; Interpolations in English Life Tables; Special Method in English Life Table No. 10 Because of Abnormal Variations in Birth Rates; Special Methods Suggested by Ackland, Buchanan, Henderson, and Greville . . . 128

VIII. THE CONSTRUCTION OF ABRIDGED LIFE TABLES FROM POPULATION STATISTICS
  Nature of the Problem . . . 129
  (i) Dr. Farr's Method; Its Defects . . . 130
  (ii) Dr. Hayward's Method; Illustrations by Hayward and Dunlop, and J.I.A. Comparisons . . . 131
  (iii) George King's Method; Detailed Descriptions in U.S. Abridged Life Tables 1919-20 . . . 132
  (iv) Method Given by Editors of J.I.A. . . . 133
  (v) Reed and Merrell's Method . . . 134









  (vi) E. C. Snow's Illustrations in 75th Report of Registrar-General of England and Wales . . . 135
  (vii) Construction by Reference to a Standard Table: (a) Method Given in J.I.A.; (b) Method Used in Annual Abridged Life Tables of National Office of Vital Statistics . . . 136

IX. METHODS OF COMPARING THE MORTALITIES OF DIFFERENT COMMUNITIES
  Comparisons of ... or ... . . . 137
  Nature and Defects of the Crude Death Rate, the Life Table Death Rate, and Expectations of Life . . . 138
  The Directly Standardized Death Rate of the Registrar-General of England and Wales; Male, Female, and Persons Rates . . . 139
  Desiderata in the Standard Population; the Registrar-General's Comparative Mortality Index; Necessary Precautions . . . 140
  The Registrar-General's Indirect Method of Standardization; Wolfenden's Demonstrations of the Consistency of the Male, Female, and Persons Formulae; Limitations of the Method; Its Relation to Comparisons of Ratios of Actual to Expected Deaths . . . 141

X. THE FORECASTING OF MORTALITY RATES
  General Principles . . . 142
  (1) Rates Showing Variation for Constant Age According to Year of Observation . . . 143 (1)
  (2) Rates Showing Variation According to Age for a Constant Year of Observation . . . 143 (2)
  (3) Rates Showing Variation According to Age and Year of Observation (the Generation Method) . . . 143 (3)
  Forecasting by Extrapolation from the Rates of (1) and (3) . . . 144
  Applications to Population Mortality: Methods Illustrated by Knibbs; Derrick; Kermack, McKendrick and McKinlay; Greenwood; Rhodes; Cramér and Wold; and Pollard . . . 145

XI. MORTALITY BY CAUSE OF DEATH
  Development of the International Statistical Classification of Diseases, Injuries, and Causes of Death . . . 146
  The Problems of Defining and Tabulating Immediate and Contributory Causes . . . 147
  Methods of Exhibiting Rates of Mortality by Cause . . . 148
  The Construction of Mortality Tables Analyzed by Causes of Death: Methods and Tables Prepared by Nathan, the National Office of Vital Statistics, Wyss, Karn, and Greville . . . 149
  Mortality Tables Showing the Effects of Eliminating a Particular Cause of Death . . . 150
  Methods Suggested for Measuring the Importance of Different Causes of Death Based on Potential Years of Life Lost . . . 151








XII. OCCUPATIONAL MORTALITY
  The English and Scottish Occupational Mortality Studies of Farr, Ogle, Tatham, Stevenson, Stocks, Blair-Cunynghame, and Dunlop, and European Studies . . . 152
  Difficulties in Calculating and Comparing Occupational Mortality Rates: . . . 153
    The Classification of Occupations . . . 153 (i)
    Discrepancies between Designations on the Census and Death Certifications . . . 153 (ii)
    Determination of the Population at Risk in Each Occupational Group . . . 153 (iii)
    Physical Fitness, Environment, Social-Economic Status, etc., in Relation to the Occupation . . . 153 (iv)
    Methods of Comparing the Mortalities of Different Occupations . . . 153 (v)

XIII. THE USE OF CENSUS AND REGISTRATION DATA IN THE COMPILATION OF STATISTICS RELATING TO MARRIAGES, BIRTHS, ORPHANHOOD, UNEMPLOYMENT, ETC.
  Preliminary . . . 154
  (1) Marriages
    Rates of Marriage . . . 155
    Double Decrement Mortality and Marriage Tables . . . 156
    Tabulations of Marital Status, and Relative Ages of Husbands and Wives . . . 157
    Mortality According to Marital Status . . . 158
    Rates of Widowhood . . . 159
    The Importance of Marriage Statistics in Population Trends . . . 160
  (2) Births and Fertility
    Probabilities or Rates of Issue by Sex and Age . . . 161
    Directly and Indirectly Standardized Birth Rates . . . 162
    British and U.S. Census Data Respecting Fertility . . . 163
  (3) Dependency and Orphanhood
    New Zealand and British Dependency Data . . . 164
    U.S. Statistics on Family Composition . . . 165
  (4) Unemployment
    U.S. and Canadian Census Queries re Employment . . . 166

XIV. THE THEORY OF REPRODUCTIVITY
  The Reasons for Constructing Measures of Reproductivity . . . 167
  The Female Gross Reproduction Rate, and the Corresponding Male Rate . . . 168
  The Female Net Reproduction Rate, and the Corresponding Male Rate, and the Net Fertility Function . . . 169
  The Practical Limitations of Such Theoretical Reproduction Rates . . . 170















The Dangers of Reproduction Rates as Measures of Future
Population Growth . . . 171

The Basic Conflict between Male and Female Net Reproduction
Rates, and Mathematical Formulation; the Investigations
of Lotka, Rhodes, Kuczynski, Pollard, and Karmel . . . 172

Reproduction Rates Based on More Refined Birth Rates; Methods
Proposed by Clark and Dyne, Whelpton, Depoid, and
Woofter . . . 173

The Inherent Rate of Natural Increase: Theoretical Formulation
and Practical Calculation; the Theory of the Stable Age
Distribution . . . 174

Relationships between the Hypothetical Life Table Population
and the Theoretical Population of Stable Age Distribution
Increasing at Rate r . . . 175

More Elaborate Inherent Rates of Natural Increase . . . 176

The Replacement Index . . . 177

Limitations Surrounding the Theories of Reproduction Rates and
the Inherent Rate of Natural Increase . . . 178

XV. SICKNESS DATA

The Importance and Practicability of Statistical Analyses of
Sickness Records . . . 179

Defects in Sickness Data Obtained by Census Methods from the
General Population . . . 180

History of British and U.S. Methods of Collecting Sickness Data
by Continuous Registration . . . 181

The Model State Law and Notification Blank in the U.S., and the
Development of a Standard Diagnosis Code in the International
List . . . 182

The Limitations of Sickness Statistics . . . 183

XVI. CONCLUSION . . . 184


SOME THEORY IN THE SAMPLING OF HUMAN
POPULATIONS
By W. Edwards Deming, Ph.D.

SECTION

1. The Reasons for Sampling

2. Uses of Sample Surveys

3. Some of the Uses of Sampling in Connection with Censuses of
Population

4. Definitions: Frame, Sampling Unit, Probability Sample,
Estimate, Standard Error, Bias, Population, Sample Design

5. The Aim of Sample Design

6. Random Variables; Random Numbers; The Mean of the
Universe (μ), and Its Variance (σ²), Standard Deviation (σ), and
Coefficient of Variation (γ)













7. Fundamental Theorems: The Mean of the Distribution of the
Sample Mean x̄; Variance of the Distribution of x̄ When the
Drawings Are Made without Replacements; The Finite Multiplier
(N − n)/(N − 1); The Standard Error of the Estimate X;
The Coefficient of Variation; The 2-Celled Universe

8. Single-Stage Sampling

9. Systematic or Patterned Sampling

10. The Appraisal of Precision

11. Two-Stage Sampling

12. Calibration Samples

13. Stratified Sampling: Outline of the Theory; Optimum
Allocation; Proportionate Allocation






I 


INTRODUCTORY 


SCOPE OF STUDY

1. The statistical or collective study of human life, or, as it is called,
the science of Demography, may be considered as including the following
subdivisions (see “Vital Statistics,” by G. C. Whipple):

(1) Genealogy, which deals with individual ancestries and personal
records.

(2) Human Eugenics, which investigates heredity from a scientific
standpoint and is to a large extent the application of statistical method to
genealogy.

(3) The Census, that is, the collection of physical, social, political,
religious, and educational facts concerning population, usually by the
method of governmental enumeration.

(4) The Registration of vital facts, such as those concerning birth,
marriage, divorce, sickness, and death, usually under governmental
direction and by use of individual records.

(5) Vital Statistics, i.e., the application of statistical methods to the
study of these facts.

(6) Biometrics, which includes anthropometric studies of human
growth, stature, strength, etc.

(7) Pathometrics, or statistical pathology, which includes the detailed
study of diseases and their relations to the human body. These facts are
obtained largely in hospitals, by health department laboratories, and by
life insurance companies.

The following study will be devoted entirely to (3), (4), and (5) above.

VALUE OF STUDY OF VITAL STATISTICS

2. The Census is a national inventory, and is required primarily for
the adjustment of representation in legislative bodies. The information
gathered is also of great value to commerce and industry, and in many of
the administrative problems of governments. Registrations of births,
marriages, and deaths are essential as a means of preserving evidence of
personal status, in order to establish the individual's identity, age,
citizenship, and marital condition for the proper determination of the
various rights and obligations which arise in connection with life insurance,
pension plans, social security, employment, passports, military service,
public assistance, etc., and in the settlement of estates and inheritances.
Moreover, when taken in conjunction with the census tabulations, they
are of great statistical importance in many social investigations and
enterprises undertaken by governmental or private agencies on both national
and local bases, and they frequently provide invaluable material for the
guidance of public health authorities as well as for the construction by
actuaries of mortality and other tables based on statistics of the general
population.

3. The desirability of a comprehensive system of social statistics is well
stated in the following quotations which are given in a booklet “The Story
of the Census, 1790-1916,” published by the United States Census
Bureau. In a speech delivered so long ago as 1869, in connection with a
bill to provide for the taking of the Ninth Census of the United States,
President James A. Garfield said:

“The developments of statistics are causing history to be rewritten.
Till recently, the historian studied nations in the aggregate, and gave us
only the story of princes, dynasties, sieges, and battles. Of the people
themselves, the great social body, with life, growth, forces, elements,
and laws of its own, he told us nothing. Now, statistical inquiry leads him
into hovels, homes, workshops, mines, fields, prisons, hospitals, and all
other places where human nature displays its weakness and its strength.
In these explorations he discovers the seeds of national growth and decay,
and thus becomes the prophet of his generation.

“The chief instrument of American statistics is the census, which
should accomplish a twofold object. It should serve the country, by making
a full and accurate exhibit of the elements of national life and strength;
and it should serve the science of statistics by so exhibiting general results
that they may be compared with similar data obtained by other nations.
The census is indispensable to modern statesmanship.”

A slightly different phase is referred to by John Cummings (J.A.S.A.,
XIII, 605):

“It is true of every sort of social change, whether of progress or decline,
that the steps are imperceptible to the unaided vision of those who, as
legislators or administrators, in the face of existing conditions of infinite
complexity in their origin and interdependence, mold public policy. To
determine the direction and extent of these changes requires the survey
of a long period of time. It requires accurate measurements which embrace
the full detail of social phenomena, and it is the proper function of a
great statistical laboratory, by assembling the data of social phenomena,
to make this survey, and by so doing to extend the scope and power of
vision of those who are at any given time directing the trend of social
forces. In the records of such a laboratory the growth of a nation is
epitomized and in its current work the imperceptible changes which are taking
place are accurately determined.”

4. The principal types of information which are of value in most
countries from these points of view are: The number of people, and their
distribution by residence, with the numbers classified as urban and rural
and the degree of concentration; age, sex, marital condition, and data
respecting fertility; occupation, industry in which employed, and class
of worker; color or race (i.e., white, Negro, Indian, Chinese, Japanese,
etc.); years of schooling and/or highest grade of school completed;
country of birth; year of immigration and citizenship of the foreign-born;
nativity of parents; mother tongue and ability to speak English; religion;
mode of housing; employment status and earnings; status under Social
Security legislation; and birth-rates, marriage rates, death-rates, rates of
widowhood, probabilities of issue, statistics of dependency and
orphanhood, and rates of unemployment.

5. The collection of the statistics under these various headings is
effected either by “enumeration” or “registration.” Enumeration is the
process used in census-taking, when authorized officers called
“enumerators,” charged with the collection of the desired information, personally
visit the individuals (or other units) to be enumerated. In the case of
Registration, such as that of births, deaths, and marriages, the desired
facts are reported to designated officers, commonly called “registrars,”
in accordance with prescribed regulations. The method of enumeration
lends itself only to periodical and relatively infrequent enquiries, while
registration is designed to secure a continuous record of events.



II


THE CENSUS

HISTORY OF CENSUS-TAKING

6. The idea of counting the number of people in a country is, of course,
a very ancient one. Sir George H. Knibbs, late Commonwealth Statistician
of Australia, gives the following interesting account of early
census-taking:* “Though the practice of census-taking, in some form or other, is
probably as old as any form of civilization, the institution now known as
the Census may be said, in so far as its scope and application are
concerned, to have been evolved only during the 19th century. We at least
know that in Babylonia statistical inquiries were carried out as far back
as 3800 or perhaps even 4500 B.C., whilst in China enumerations of the
people took place certainly as early as about 3000 B.C., and in Egypt in
about 2500 B.C. It is not without interest to note that the first Biblical
account of an enumeration of the people is that referred to in the Book of
Exodus (Exodus, xxx, 12), where it is stated that Moses was directed to
number the Children of Israel and to levy a poll tax, the assigned date of
this being 1491 B.C. There are several other Biblical references to Censuses
(Numbers, i, 1-3, and 47-49; Numbers, iii, 14, etc., and iv, 34, etc.;
1 Chronicles, xxiii, 3, etc.; 2 Chronicles, ii, 17; 2 Samuel, xxiv, 1-9; Ezra,
ii, 1-61; Nehemiah, vii, 6-69). The most notable of all these is, perhaps,
that carried out in 1017 B.C. by the Hebrew King David. Strange as it
may appear today, there is good authority for believing that the Biblical
account of the Divine wrath (1 Chronicles, xxvii, 24; see also 1 Chronicles,
xxi, 1-6), resulting from the action of David in carrying out this
enumeration of the Israelites, gave rise to the idea that the act of Census-taking
was in all cases a religious offence, and consequently had the effect of
delaying the adoption of the Census in England for many years. A form of
Census, taken every quinquennium for fiscal and military purposes, was a
regular Roman institution, and lasted from about 435 B.C. until the
sacking of Rome (A.D. 410). After the latter date . . . various works of a
statistical nature, notably the Breviary of Charlemagne (A.D. 808) and
the Domesday Book of William the Conqueror (A.D. 1086), were compiled
in Europe during the Middle Ages.” These early records of the Middle

* See “The First Commonwealth Census, 3rd April, 1911 - Notes by G. H. Knibbs,”
and the exhaustive “Historical Review of Census Development” on pp. I- 0, Vol. I,
1911 Census of Australia.


Ages, while of course not census enumerations in the modern sense, are, as
P. Granville Edge has observed (“Vital Registration in Europe,” J.R.S.S.,
XCI [1928], 348), of considerable scientific interest and importance; and
in addition to the Domesday Book he cites “the poll-tax returns of the
14th century in England, the Dutch tax registers of the 14th century, the
‘État des Subsides’ of 1328 enumerating the numbers of ‘feux’ or hearths
in the territories of Philip VI of France, the ‘Denombrement de la
Prevoste et Chastellanie de Pontoise’ made in 1332 in order to ascertain
the dowry of Queen Jeanne, the enumerations made for purposes of
administration in certain German cities during the Middle Ages, and at the
end of the 17th century the French records of the ‘Intendants des
Provinces.’ ”

The earliest statistical enquiries in Babylonia, China, and Egypt, to
which Knibbs refers, appear to have been based on partial enumerations
only, covering heads of families, taxpayers, males (citizens and slaves), or
men of military age, and were undertaken, like those among the Biblical
Hebrews, in Rome, and in other similar enumerations conducted in
Greece, specifically for taxation and military purposes. The European
records of the Middle Ages were likewise incomplete, although surveys of
hearths, tax registers, and lists of citizens became more frequent, and some
authorities suggest that a complete population count was taken in
Nürenberg in 1449.

Beloch and Italian investigators have drawn attention to the remarkable
series of population records, compiled usually by the parish priests and
reported to the government, for the various states of Italy and Sicily in
the 16th, 17th, and 18th centuries (some of them complete censuses for
each sex separately, and even by age groups 18-50 and over 50 for males)
which repose in 1,116 volumes in the archives of Naples and elsewhere. In
Spain, also, as is pointed out by A. B. Wolfe (“Population Censuses
before 1790,” J.A.S.A., XXVII [1932], 357), “we have for the 16th century
richer statistical material than for any other country in Europe with the
exception of Italy,” although there was still no reliable enumeration of the
whole population. For France, Edge (loc. cit.) has emphasized the fact
that “distinguished administrators and men of science had persistently
advocated regular census enumerations probably earlier than in other
European countries”; the practical results, however, were fragmentary.

Knibbs, an eminent authority, has stated (loc. cit.) that, following the
partial and irregular recordings of the Middle Ages, “the credit for the
revival of systematic enumeration belongs to the Canadian Province of
Quebec, or La Nouvelle France as it was then called, [where] a census
was taken in 1666”—doubtless under the stimulus of the French adminis- 



                               Usual Inter-   Date of Most     Dates of Last Three   Present
Country                        censal Period  Recent           Previous              Custom
                               (Years)        Enumeration      Enumerations          Established

Europe (continued):
  Norway                       10             Dec. 1, 1950     1946, 1930, 1920      ....
  Poland                       irregular      Dec. 15, 1950    1946, 1931, 1921      ....
  Portugal                     10             Dec. 15, 1950    1940, 1930, 1920      ....
  Rumania                      irregular      Jan. 25, 1948    1941, 1930, 1913      ....
  Russia (U.S.S.R.)¹¹          irregular      Jan. 17, 1939    1926, 1920, 1897      ....
  Spain                        10             Dec. 31, 1950    1940, 1930, 1920      ....
  Sweden¹²                     5              Dec. 31, 1950    1945, 1940, 1935      1860
  Switzerland¹³                10             Dec. 1, 1950     1941, 1930, 1920      1880
  Yugoslavia                   irregular      Mar. 15, 1948    1931, 1921, 1910      ....

Asia:
  Burma                        10             Mar. 5, 1941     1931, 1921, 1911      ....
  China¹⁴                      ....           ....             ....                  ....
  India                        10             Mar. 1, 1951     1941, 1931, 1921      1881
  Japan¹⁵                      irregular      Oct. 1, 1950     1946, 1945, 1944      ....
  Korea (Republic)             5              May 1, 1949      1940, 1935, 1930      ....
  Malaya                       10             Sept. 23, 1947   1931, 1921, 1911      ....
  Netherlands East Indies
    (U.S. of Indonesia)        ....           Oct. 7, 1930     1920                  ....
  Palestine (Israel)           ....           Nov. 8, 1948     1931, 1922            ....

Africa:
  Egypt                        10             Mar. 27, 1947    1937, 1927, 1917      1897
  Union of South Africa¹⁶      5              May 9, 1951      1946, 1941, 1936      1911

Australasia:
  Australia¹⁷                  irregular      June 30, 1947    1933, 1921, 1911      1911
  New Zealand¹⁸                5              Apr. 17, 1951    1945, 1936, 1926      1881


¹¹ First complete census in 1897. A census taken in 1937 was stated officially to be unscientific and
inaccurate; the schedules were destroyed and a new census ordered.

¹² First census 1749; triennially (with three omissions) to 1775, quinquennially to 1860, and thence
decennially; quinquennially now; “enumeration” annually.

¹³ Census in 1888 instead of 1890.

¹⁴ Formal censuses made only three times in Chinese history (in 1910, 1912, and 1928); all, however, were
faulty. Recently a census has been ordered as of June 30,

¹⁵ First actual enumeration in 1913.

¹⁶ The 1931 and 1941 censuses covered the European population only. Censuses were made in 1904 in
each of the four Colonies which were incorporated as the Union in 1910.

¹⁷ Prior to federation of the Australian States, each was responsible for its own census, the earliest of
which was that of New South Wales in 1828.

¹⁸ First census in 1858; irregularly to 1881. The quinquennial census due in 1931 was abandoned as an
economy measure.


Statistics in Europe, 1918-1939; An Annotated Bibliography” and the “1940-1948
Supplement” thereto, “Population Index” (quarterly publication of the Population
Association of America), and the valuable series of Statistical Handbooks of the
League of Nations (published by the Health Organization of the League, and
comprising 14 volumes for The Netherlands, Belgium, England and Wales, Spain, Austria,
the Scandinavian Countries and the Baltic Republics, Portugal, Czechoslovakia,
France, Hungary, Ireland, Switzerland, Scotland, and Canada). Reference should be
made to these sources and the current publications of the various census offices for
detail, and for data respecting countries not included in this table.


























portant countries. The regularity which previously had been established
in the census procedures of Great Britain and many of the European
countries was interrupted by the War. Registrations of the population for
the purpose of issuing ration books were undertaken in several countries
during the War, but they are not shown in this table which is intended
to give a record of actual censuses only. The irregularities in the census
dates of many of the Latin American countries are likewise due partly to
unsettled political conditions.

GENERAL PRINCIPLES OF CENSUS-TAKING

8. At an International Statistical Congress in St. Petersburg in 1872
(see J.R.S.S., XXXV [1872], 431-57, and Dudfield, J.I.A., XXXV, Ml)
an attempt was made to establish some degree of uniformity, which was
believed to be desirable, in the conduct of the census in the various
countries. Representatives were sent from the chief European nations and their
dependencies, and from the United States. The most important decisions
and recommendations of the Congress were as follows:

(1) To avoid misunderstanding and wrong use of terms, it is necessary
to recognize:

(a) The de facto or present population, i.e., the whole number
present in the place where and at the moment when the census is taken.

(b) The population of habitual residence, or “domiciliated”
population (usually called the de jure population), i.e., the population whose
habitual residence is in the place where the census is taken (includes
those temporarily absent and excludes those who are only temporarily
present).

(c) The legal population, i.e., persons whose legal residence is in
the place where the census is taken and who are registered there if legal
registration is required.

(2) A general Census should include the names of the population (i.e.,
should be “nominal”).

(3) As far as possible the census should be taken in one day, or at least
be reported to a fixed day and an appointed hour.

(4) A Census should be taken at least every ten years, and in years
ending in 0.

(5) The “essential” information comprises:

(a) Names and given names, (b) Sex, (c) Age, (d) Relationship to
head of family, (e) Civil or conjugal state, (f) Profession or occupation,
(g) Religion, (h) Language spoken, (i) Knowledge of reading and
writing, (j) Origin (extraction), place of birth, nationality, (k) Usual
residence and nature of sojourn at place of registration, (l) Blindness,
deafness, muteness, idiocy, and mental aberrations. [The desirability
of including these details of infirmities is now seriously questioned;
see par. 27 here.]

All other information is optional.

(6) Where the degree of popular intelligence permits, and especially in
large cities, age should be stated by year and month of birth. When the
age is expressed in years, it should be age last birthday; for infants, in
completed months.
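
The age rule in (6) can be illustrated with a short computation. The Python sketch below is an illustration only and is not taken from the Congress's recommendations: the function names are hypothetical, a full date of birth is assumed (the recommendation asks only for year and month), and a birthday falling on a day that does not exist in the census month is treated as not yet reached.

```python
from datetime import date


def completed_months(birth: date, census_day: date) -> int:
    """Whole months elapsed between the date of birth and the census day."""
    if census_day < birth:
        raise ValueError("census day precedes date of birth")
    months = (census_day.year - birth.year) * 12 + (census_day.month - birth.month)
    # The current month only counts once the day of birth has been reached.
    if census_day.day < birth.day:
        months -= 1
    return months


def age_last_birthday(birth: date, census_day: date) -> int:
    """Age in completed years, i.e. age at the last birthday."""
    return completed_months(birth, census_day) // 12


def census_age(birth: date, census_day: date) -> tuple[str, int]:
    """Return the age in the unit suggested by rule (6):
    completed months for infants under one year, completed years otherwise."""
    months = completed_months(birth, census_day)
    if months < 12:
        return ("months", months)
    return ("years", months // 12)


if __name__ == "__main__":
    census_day = date(1950, 4, 1)  # assumed census day, for illustration
    print(census_age(date(1949, 10, 15), census_day))
    print(census_age(date(1920, 4, 2), census_day))
```

Stating age as "last birthday" rather than "nearest birthday" means the recorded figure never exceeds the exact age, which is why the sketch always rounds down.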

The desirability of attaining the greatest practicable degree of
uniformity in the census procedures of the different countries has been
emphasized again recently by the proposal, made in 1943 by the Chairman
(A. Arca Parró, then National Director of Statistics in Peru) of the
Committee on Demographic Statistics of the Inter-American Statistical
Institute, for a hemispheral population census by all the 22 American nations
in 1950, which would observe the following minimum standards (see
“Estadistica,” Journal of the Inter-American Statistical Institute, No. 9,
1945, p. 11):

(1) Each census should be taken in 1950, or before June, 1951, at the
latest; (2) the census organization of each nation should be centralized
with a high degree of autonomy, thereby guaranteeing a permanent staff
of qualified technicians; (3) those nations which lack adequate maps for
census enumerations should prepare them; (4) each census should be
publicized, and the efficiency of the organization checked, by making
pre-census counts, or a trial census; (5) the census should be taken by the
“canvasser” system (see par. 10 of this Study) with adequately instructed
enumerators, and should be completed in the shortest possible time; (6)
all definitions in the schedules should be clear and concise, and to
facilitate international comparisons the questions concerning age, place of
birth, education, occupation, industry, economic position, and
relationship to head of family should be uniform in all the countries; (7)
recommended international nomenclatures (such as the League of Nations'
classification for the gainfully occupied population) should be used when
available; (8) where census omissions are supplied by estimation, both the
enumerated and calculated populations should be given; and (9) the
results should be published within a definite time limit, national results
having priority over those for minor political divisions.

“De Facto” and “De Jure” Methods
9. In addition to the preceding suggestions it was resolved at the 1872
International Statistical Congress that the enumerations should be “de
facto” and not “de jure,” on account of the greater simplicity of the former
method. This recommendation, however, was not in harmony with the
opinions of the U.S. representatives, who gave the following reasons for
their views (see publications of 43rd Congress, 1st session, House Exec.
Doc. No. 28^, Serial No. 1615):

(a) The “de jure” population is the permanent population, which is
what is desired; and

(b) Exact facts relating to those habitual residents can readily and
correctly be obtained, whereas it is difficult to so treat the floating
population.

(c) Under the “de facto” method persons actually traveling are
indexed with difficulty; and

(d) It would result in erroneous and unjust information concerning
certain communities which might be temporarily inflated or
depleted at the time of the census.

The greater theoretical defensibility of the “de jure” method, as a
result of its attempt to obtain the true geographical distribution of the
population, is now widely recognized. In considering the plans for the 1921
censuses of the United Kingdom the Royal Statistical Society urged
that, although, “in order to preserve the continuity of our national
statistics, the ‘de facto’ population should be obtained, whatever other
methods of tabulation are added, it is desirable that a ‘de jure’ tabulation
should also be made, and the Census authorities should also be asked to
consider whether the schedule could be so amended as to provide, in case
of visitors, a statement of their usual place of residence. By this method
the number of visitors (i.e. those persons who have a more permanent
residence elsewhere) could be subtracted in the records of the district
where enumerated, and transferred to the district where they usually
resided. . . . The ‘de facto’ enumeration gives only an instantaneous
picture of the population where it chances to be on a selected Sunday night,
so that travellers, visitors . . . and people away for the week end are
counted in districts where they have no permanent residence. The fact
that in fixing the date of the Census a night is chosen on which it is
presumed that the minimum number of persons will be away from their own
homes shows that the aim of those responsible for the ‘de facto’ censuses
in the past has been to approximate them as nearly as possible to ‘de jure’
enumerations. The latter therefore have been tacitly acknowledged to
form the ideal method of presenting the Census results, though practical
considerations of accuracy and convenience might render that ideal
difficult or impossible of attainment. . . . For all such questions as the
apportionment of electoral areas, municipal status, equalization of rates,
housing and so forth the ‘de jure’ population is evidently the appropriate
measure, and the ‘de facto’ is only tolerable as a substitute in so far as it
approximates to the ‘de jure’. . .” (see Report on the Census, J.R.S.S.,
LXXXIII, 134).
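
The transfer of visitors described in the Society's proposal can be sketched as a small tabulation. The Python fragment below is a hypothetical illustration, not a procedure given in the Report: the record layout, the field names, and the district labels are all assumptions, and persons absent from the country on census night are ignored, so it yields only the approximation to a “de jure” count that the usual-residence question makes possible.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class Person(NamedTuple):
    """One schedule entry (hypothetical layout)."""
    enumerated_in: str                      # district where found on census night
    usual_residence: Optional[str] = None   # filled in for visitors only


def tabulate(records: Iterable[Person]) -> tuple[Counter, Counter]:
    """Return (de_facto, de_jure) population counts by district.

    De facto: each person is counted where enumerated.
    De jure:  each visitor is subtracted from the district of enumeration
              and transferred to the district of usual residence.
    """
    de_facto: Counter = Counter()
    de_jure: Counter = Counter()
    for person in records:
        de_facto[person.enumerated_in] += 1
        # A resident has no separate usual residence recorded.
        home = person.usual_residence or person.enumerated_in
        de_jure[home] += 1
    return de_facto, de_jure


if __name__ == "__main__":
    schedule = [
        Person("A"), Person("A"), Person("A"),
        Person("A", usual_residence="B"),   # a visitor to district A
        Person("B"), Person("B"),
    ]
    de_facto, de_jure = tabulate(schedule)
    print(dict(de_facto))
    print(dict(de_jure))
```

Both tabulations account for the same persons, so the national totals agree; only the geographical distribution differs.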

The censuses of the United States and Canada, accordingly, are made
on the “de jure” basis, and the most recent censuses of Costa Rica, Cuba,
Mexico, and Nicaragua were also “de jure” (see R. Luna Vegas, “Metodos
de los Censos de Poblacion de las Naciones Americanas,” Estadistica,
III, No. 9 for March, 1945). The “de facto” system, however, was used
in Great Britain until 1931, when the Royal Statistical Society's
recommended change was adopted by including also a question as to usual
residence; and it has been followed extensively in Europe and in
Latin-American countries (Bolivia, Colombia, Chile, El Salvador, Guatemala,
Honduras, Panama, Peru, and Venezuela; see R. Luna Vegas, loc. cit.).
Many countries (e.g., France, Belgium, Germany, Norway, Italy, Spain,
Portugal, and Brazil) now present their data for both “de facto” and “de
jure” populations.

“Householder” and “Canvasser”

10. The problem of “de facto” or “de jure” enumeration, however, is
intimately connected with the question as to who is primarily responsible
for filling in the particulars on the forms. In Europe and most parts of the
British Commonwealth (except Canada and India) the general practice
is to employ a method which may be called the Householder method, by
which “the occupier of each dwelling is held responsible for furnishing a
written record of the desired particulars relative to the inmates of the
dwelling occupied by him” (see Wickens, J.I.A., XLIII, 34); while in the
United States, Canada, and India the Canvasser system is usually employed,
whereby the enumerators themselves elicit the desired information
by direct enquiry. Neither the system of “de facto” and “householder”
enumeration, nor the American combination of “de jure” and
“canvasser” methods, however, is entirely satisfactory. With the “de
facto householder” system the actual count of population for the whole
country is probably fairly accurate, since no adjustments have to be made
to the population which was in fact within the enumeration district on the
census day; but the approximately true geographical distribution is not
obtained, and it is stated on pp. 13-14 of the “General Report (with
Appendices) of the Census of England and Wales, 1911” (Cmd. 8491) that
“the transfer to the householder of the duty of record can be regarded as
advantageous, if at all, only provided that the scope of the census enquiry
is to be severely restricted,” because “the census schedule is an elaborate
and in the nature of things a difficult form to fill in, and the average
householder is a person without much clerical or literary training, and quite
unaccustomed to the formidable form with which he is confronted.” Under
the American “de jure canvasser” system, on the other hand, “the
determination of the ‘usual place of abode’ is admittedly one of the greatest
difficulties of the enumerators, who are instructed to count persons
temporarily absent from their districts but not to enumerate persons
temporarily present ‘unless it is practically certain that they would not be
enumerated anywhere else.’ Particularly therefore in the case of men,
who are more likely than women to be away from home, there is danger
of considerable error by unintentional omission or duplication . . . and
the employment of ‘de jure’ populations in the calculation of rates of
mortality renders the difficult question of the distribution of non-resident
and institutional deaths to their usual residences a matter of great
importance, for otherwise the deaths and the populations from which they
arise will not correspond.”* The “canvasser” method, however,
notwithstanding its higher cost and its dependence on the efficiency of the
enumerators, is justified by the more elaborate enquiries which can be
made when the information, as in that system, is obtained directly by
officials who are familiar with the requirements of the schedule, and by
the fact that it secures more reliable information from colored and
foreign-born populations among whom the percentage of illiteracy is generally
high.

Much consideration has been given by census authorities to the defects
of the above systems, and the steps which might be taken to improve
them. Thus in the United States the expense and possible inefficiency of
the enumerators under the “canvasser” system is recognized, and at the
1910 census the “householder” method was tried experimentally for a
small section of the population but, probably because no penalty was
provided in case of failure to fill in the schedule, only a small number were
completed and the results were not satisfactory.† In Great Britain, while
the shortcomings of the “householder” system are admitted officially (see,
for example, the General Report on the 1911 Census, pp. 13-14), and the
adoption of the “canvasser” method was urged by the 1921 Census Committee
of the Royal Statistical Society (see J.R.S.S., LXXXIII, 138), it
was then suggested that the American “canvasser” system has the
disadvantage that “the compilation of the schedules occupies a considerable
time, two weeks to one month we understand, during which appreciable

* piijur !iv It. 11. \V'»iri‘!nlrn. "(mi lln- Mi-ilutil^ .hhI I’liiilii :iiimM 
• •f ihe t'liilrij Slalr.s Uijii-.iu,” I. Will, .?!>•*, and /tar. 'in lien:. 

† Ibid.




movement of population must take place.”* A modified combination of
the “de facto” and “canvasser” methods was consequently then commended,
on the lines of the system employed in India, in which the
enumerators visit every house prior to a fixed census day and obtain the
necessary information “for every person habitually living in the house,
and when the census day comes round he revisits the house, inquires who
have slept there the previous night, strikes out the entries of any absentees
and has to record on the busy day itself the facts only with regard to newcomers,
who must form but a small proportion of the total population”
(op. cit., Cmd. 8491).
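
The two-stage procedure just described (a preliminary record, followed by a revisit on the census day) can be sketched as follows. This is only an illustration under simplifying assumptions: persons are identified by name alone, and the function name and household entries are hypothetical.

```python
def census_day_roll(preliminary, slept_last_night):
    """Revise a preliminary household record on the census day.

    preliminary      -- names recorded before the census day as habitually
                        living in the house
    slept_last_night -- names reported on the revisit as having slept there
                        the previous night
    Returns the final roll and the list of newcomers still to be recorded.
    """
    listed = set(preliminary)
    present = set(slept_last_night)
    # Strike out the entries of any absentees.
    retained = [name for name in preliminary if name in present]
    # Only newcomers need full particulars taken on the busy day itself.
    newcomers = [name for name in slept_last_night if name not in listed]
    return retained + newcomers, newcomers


# Hypothetical household: one absentee (Ravi) and one newcomer (Kiran).
preliminary = ["Asha", "Ravi", "Meena"]
slept_last_night = ["Asha", "Meena", "Kiran"]
roll, newcomers = census_day_roll(preliminary, slept_last_night)
print(roll)
print(newcomers)
```

The work left for the census day is proportional to the number of newcomers only, which is the advantage claimed for the method.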

Date of Census, and Intercensal Period

11. Just as in the case of the above matters, considerable discussion
and diversity of practice have arisen from some of the other recommendations
of the 1872 Congress. The dates on which the various censuses
are taken are by no means uniform. This, however, is a question which is
necessarily affected by local conditions, because it is clear that any
census should be made at a time when disturbances of population on
account of general and special holidays, fairs, religious festivals, etc., are
at a minimum. Again it has frequently been urged by authorities of high
standing that a census should always be taken at least once every five
years. This is obviously desirable not only for legislative and general
economic purposes but also to facilitate the computation of reliable rates
of births, deaths, and marriages, because, especially in countries which
experience large waves of immigration, it is very difficult to estimate
populations over a ten-year period. Greater familiarity with the importance
of the census, and closer attention to accuracy in supplying the

* This objection, however, is of course dealt with carefully in the American system.
In the 1940 U.S. census, for example, “special provision was made for the enumeration
of transients. As the census enumeration is carried on over a period of several weeks,
transients may be missed by the enumerator if they move during the enumeration
period. In order to avoid this contingency, April 8 was set aside as the day when the
usual places of residence of transients in all cities would be visited by enumerators. . . .
Absent family schedules and non-resident schedules were used more extensively than
in previous censuses, and were mailed directly to Washington for allocation to the
proper enumeration districts. A card for new occupants was left in all vacant dwelling
units to insure the enumeration of persons moving during the census period. The use
of these supplemental forms was aimed directly at securing a more complete count of
transient population than in previous censuses. It has been necessary to check these
various forms against the names on the population schedules to avoid duplicate
enumeration. It is evident that a good enumeration of transients has been secured” (Annual
Report of the Secretary of Commerce, p. 41).





information, would also attend more frequent enumeration. Prior to the
interruptions caused by the War, the custom of taking a quinquennial
census was actually established in several countries, notably Denmark,
France, Germany (though with considerable latitude in its application),
Honduras, Sweden, the Union of South Africa, and New Zealand. In
Great Britain authority to take such a census was conferred by the
Census Act of 1920 (see par. 14), and for more than fifty years the Royal
Statistical Society (see its “Memorandum Regarding a Quinquennial
Census,” J.R.S.S., XCVIII [1935], 523) has steadily advocated, though
hitherto without success, the taking of a population census every five
years. In Canada the custom of taking a quinquennial census in the
western prairie Provinces has now been established for some time (see
par. 13).

Several other methods have also been put forward on various occasions
with the object of securing reliable population data more often than
decennially or even quinquennially. A restricted quinquennial enumeration
by the usual methods, or a postal enumeration, might be made, or
automatic registration of voters might be extended “to embrace all
householders, who might in certain years at least be required to state the age
and sex of each member of the household” (see op. cit., Cmd. 8491, p. 14,
and “The Census Methods of the Future” by E. Dana Durand, J.A.S.A.,
XIII). During the War, national registrations, for the issuance of ration
books and other war purposes, were made in the United Kingdom (see
United Kingdom, National Register: Statistics of Population on 29th
September, 1939, by Sex, Age, and Marital Condition; H.M. Stationery
Office, 1944) and several other countries. In the United States the Bureau
of the Census has devised a plan for an annual sample census of population
and agriculture (see Philip M. Hauser’s paper on a “Proposed Annual
Sample Census of Population,” J.A.S.A., XXXVII [1942], 81); and in
1943 a new national sampling procedure was placed in operation to
secure monthly estimates of the numbers by age and sex of agricultural
and non-agricultural workers, non-workers, and the unemployed. The
development of sampling techniques in census work has also led to the
suggestion in England that the “continuous enumeration” of consecutive
representative samples might be undertaken (see J. P. Mandeville,
“Improvements in Methods of Census and Survey Analysis,” J.R.S.S., CIX
[1946], 111, and discussion thereon by H. O. Hartley, pp. 126-27). Most
nations, however, so far have preferred the decennial census as the
foundation of their systems for enumerating population.





The United States Population Census*

12. The first decennial census of the United States was made in 1790,
and the most recent as of April 1, 1950, in accordance with Article I,
section 2, of the Constitution, which is, in part, as follows:

“The actual enumeration shall be made within three years after the
first meeting of the Congress of the United States and within every
subsequent term of ten years, in such manner as they shall by law direct.”

The census of 1890 was the first which covered the areas now forming
the 48 States and the District of Columbia.

Until 1900 (inclusive) the Census Office established for the taking of
each decennial census and the compilation and publication of its results
had been a temporary institution, going practically out of existence at the
conclusion of its work. On July 1, 1902, however, under authority of an
Act of Congress passed in March of that year, the Census Office became
a permanent branch of the Department of the Interior under the name
“Bureau of the Census”; later it was transferred to the Department of
Commerce and Labor, and in 1913 to the Department of Commerce. The
establishment of this permanent Bureau effected great improvements in
the personnel of the Census Office, and resulted in better and more
systematic organization throughout. The ultimate result has been a great
broadening of the scope of the Bureau’s investigations. It has been stated
by the Registrar-General of England and Wales (General Report, Census
of England and Wales, 1911, p. 13) that the United States “produces a
more elaborate census, probably, than any other country”; and it will
prove beneficial for the reader to familiarize himself with the contents of
the most recent census reports and the manner in which the facts are
presented.

At each census until that of 1930 inclusive, all the items covered by the
enquiry were recorded on the population census schedules for every
* See also the paper thereon by H. H. Wolfenden, T.A.S.A., XVIII, 290, and
discussion by J. S. Thompson; and, for more detailed information, “The Story of the
Census” already mentioned, “The History and Growth of the U.S. Census” by C. D.
Wright and W. C. Hunt (Washington, Senate Doc. No. 194, 56th Congress, 1st Session),
and “The Bureau of the Census; Its History, Activities, and Organization” by W. Stull
Holt. It should also be noted that in the individual States many censuses have been
taken on various dates between the national decennial censuses; they are mainly
intended, however, for purposes of State legislative apportionment, and they are not generally
used for statistical analyses on account of lack of uniformity in dates and details (see the
1948 publication of the Census Library Project sponsored by the U.S. Bureau of the
Census and the Library of Congress on “State Censuses; An Annotated Bibliography
of Censuses of Population Taken after the Year 1790 by States and Territories of the
United States”).





individual in the country. In the 1940 census,* however, a sampling
procedure was employed for the first time for certain items of a subsidiary
character (concerning nativity of parents, mother tongue, veterans, Social
Security status, occupational shifts, and fertility) by asking “Supplementary
Questions” only for a 5% sample which was obtained by completing
those questions for two marked lines out of forty on each of the
two identical sides of the population schedule.

In the 1950 census† the schedule was changed considerably, the back
for the first time being wholly occupied by housing items so that the
“population schedule” of earlier censuses became in 1950 the schedule for
the census of “population and housing.” The main personal questions
which were asked in respect of every individual enumerated were name,
relationship to head of household, race, sex, age last birthday (except for
infants under one year of age, for whom month of birth was required),
marital status, state or country of birth, and whether naturalized if
foreign born; and, for persons 14 years and over, eight more questions
covered employment and unemployment in the previous week, and
occupation, industry, and class of worker. The sampling also was widened
by asking additional questions for persons whose names fell on six sample
lines of the thirty-line schedule, and still further questions for each person
on the last of the six sample lines. The additional questions in respect of
every name on the six sample lines (giving a 20% sample) concerned
migration, nativity of parents, highest grade of school attended, and
school attendance since February 1, and for those aged 14 and over other
queries were included as to present unemployment and work last year,
income in 1949, and military service. The further questions which were
asked on the last sample line (for a 3⅓% sample) concerned marriage,
number of children ever born (excluding stillbirths), and in some cases
the individual’s kind of work and industry in his last job. An auxiliary
“individual census report” was used for the enumeration of persons who
were away from their usual places of abode, in order to provide a check
upon their proper entry on a regular population schedule where they lived.
An “infant card” also was used to record information for every infant
born in January, February, or March of 1950, the questions including
the date of birth of the infant, sex, the ages last birthday of the father and
* For that census the many complex procedures involved in the compilation of the
schedules and auxiliary forms by the field enumerators, and in their subsequent editing,
coding, and tabulation by the office staffs, were tested in advance by a trial census,
which was taken as of August 14, 1939, for two counties in Indiana.

† The 1950 census methods were tested in advance by three full-scale trial censuses
for certain localities which were made in 1948 and 1949, and by a further “pretest” on
a small scale in 1949.




mother, the father’s occupation and industry, and the order of birth of
the child (i.e., whether the 1st, 2nd, etc., child excluding stillbirths).
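
The sampling fractions quoted above follow directly from the number of sample lines on each side of the schedule. The short sketch below reproduces that arithmetic. The inflation of a sample count to an estimated total is the simplest expansion estimate and is shown only as an assumption for illustration, not as the Bureau's actual estimating procedure; the function names and the count of 150 are hypothetical.

```python
from fractions import Fraction


def line_sample_fraction(sample_lines, lines_per_side):
    """Fraction of enumerated persons falling on designated sample lines."""
    return Fraction(sample_lines, lines_per_side)


def expand_to_total(sample_count, fraction):
    """Simple expansion estimate: sample count divided by sampling fraction."""
    return sample_count / fraction


f_1940 = line_sample_fraction(2, 40)       # two marked lines out of forty
f_1950_six = line_sample_fraction(6, 30)   # six sample lines of thirty
f_1950_last = line_sample_fraction(1, 30)  # last of the six sample lines

for label, f in [("1940 supplementary questions", f_1940),
                 ("1950 six sample lines", f_1950_six),
                 ("1950 last sample line", f_1950_last)]:
    print(f"{label}: {f} of the population, or {float(f) * 100:.2f}%")

# Hypothetical: 150 sample persons report a given characteristic in 1940.
print(expand_to_total(150, f_1940))
```

Exact fractions are used so that one line in thirty is carried as 1/30 rather than as a rounded decimal.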

The Censuses of Canada 

13. As stated in par. 6, the first census in Canada was taken in 1666
in the Province of Quebec (then called La Nouvelle France), and is often
claimed to be the first real census of modern times. “Still earlier records
of settlement at Port Royal (1605) and Quebec (1608) are extant; but
the census of 1666 was a systematic ‘nominal’ enumeration of the people
(i.e., a record of each individual by name), taken on the ‘de jure’ principle,
on a fixed date, showing age, sex, occupation, and conjugal and family
condition. It is, therefore, clearly a census in the modern sense, and not
a mere report of settlement, like its precursors” (see the First Annual
Report of the Dominion Statistician: “The Dominion Bureau of Statistics,
Its Origin, Purpose, and Organization,” 1919, and the 1921 Census
Reports). After the British conquest in 1763, censuses of Upper and Lower
Canada, Nova Scotia, and New Brunswick “continued to be taken at frequent
though irregular intervals”; and in 1847 an Act was passed providing,
inter alia, for a decennial census, under which censuses were taken in
1851 and 1861. After Confederation a census under a special Act was taken
in 1871, and thereafter a decennial census has been made for the whole
Dominion as required by the Census and Statistics Act of 1879. Quinquennial
censuses for the rapidly-growing North-West Territories and
Manitoba were inaugurated in 1885 and 1886, were repeated for Manitoba
in 1896, and were extended by the Census and Statistics Act of 1905 to
cover Saskatchewan and Alberta as well as Manitoba. This Act of 1905,
also, made the Census Office for the first time a permanent Bureau, which
was enlarged by the Statistics Act, 1918, into the present general statistical
office called the Dominion Bureau of Statistics.

The usual census for the whole country was taken as of June 1, 1951;
the latest quinquennial census, covering Manitoba, Saskatchewan, and
Alberta (but not British Columbia or the North-West Territories), was
made in 1946.

The 39-column population schedule used in the 1941 Dominion census
was of the same general type as the 34-column schedule of the 1940 U.S.
census, with naturally some variations in arrangement and in its bilingual
(English and French) wording, and with the addition of questions
on years of immigration and naturalization, ability to speak English
and/or French, and religion.

At the 1951 census the questions, which followed the same general
pattern, were reduced to 29. The replies were recorded directly on individual
cards (about seven inches square printed on both sides) called
“mark sense documents,” the enumerators marking the data on designated
oval spaces with ink which will carry an electric current. The
mark sense documents were next “read” by a specially designed “document
punch,” equipped with brushes which pass over the document in
such a way that when the brushes come to an ink mark an electrical current
is completed and a hole is punched automatically in a punch card
correspondingly placed. An electronic statistical machine was then used
to edit these punched cards (by rejecting for examination cards showing
certain inconsistencies), and as the principal equipment for making the
final tabulations. These mechanized processes were decentralized in five
regional statistical offices across the country, from which the verified
punched cards were shipped to Ottawa for tabulation.

In order to test the many innovations thus introduced in the 1951
procedures, two trial censuses were taken in 1949, the first in Ottawa,
and a second larger one in seven centers across Canada.

The Censuses of the United Kingdom and Eire

14. The censuses of (1) England and Wales, (2) Scotland,* and (3)
Northern Ireland (which now together constitute the United Kingdom)
and (4) Eire (the Irish Free State) have always been under the control
of separate authorities, although the organizational methods and the
forms of the reports have generally exhibited considerable uniformity.
Until the passing of the Census Act of 1920 each decennial census (see
par. 7) was provided for by special legislation, England and Wales and

* In Scotland, Sir John Sinclair’s first “Statistical Account of Scotland,” which
was started in 1791 and completed in seven years, and the “Second Statistical Account”
which was prepared on the same lines by the Society for the Sons and Daughters of
the Clergy of the Church of Scotland between 1831 and 1845, are noteworthy as being
unique enquiries into the “way of life” as well as the merely statistical characteristics
of the population. They were not censuses, of course; they were based on elaborate
questionnaires which were sent to every parish minister covering “the state of the
country, for the purpose of ascertaining the quantum of happiness enjoyed by its
inhabitants and the means of its further improvement”; J. G. Kyd (the Registrar General
for Scotland) has stated, however, that from these investigations Scotland “has a more
precise knowledge of the way of life of her people in bygone days than any other country
of the world.” The data for a “Third Statistical Account” are now being compiled
with the assistance of the Nuffield Foundation and the four Scottish Universities; the
main object will be (in Kyd’s words again) “to describe not merely the physical facts,
the social conditions and industries, but to show how the people live, and what they
think about religion, their work and their leisure. . . . The promoters of this Third
Account feel that unless the local condition and the view of the local people are known,
the problems which life places before those who are planning for the future will be
difficult to measure and therefore hard to solve.”




Scotland generally being dealt with by one Act, and Ireland by another
(cf. J.I.A., XXXV, 365); and the census organization was reconstituted
for each occasion. That Act, however, which applies only to England and
Wales and Scotland, or any area therein, is permanent in its operation,
and marks a great advance in its provision that future censuses shall be
taken at intervals of not less than five years by authority of an Order-in-Council
made under the Act.

Details of the development of the English census will be found in J.I.A.,
XXV, 83 (“Some Account of the Census, from 1801-1881,” by A. F.
Burridge), XXXVI, 320 (“The Case for Census Reform” by G. H.
Ryan), J.I.A., LII, 341 (“The Census of 1921; Some Remarks on Tabulation”
by F. A. A. Menzler), and the General Report, 1911 Census of
England and Wales (pp. 2 and 25). The Householder’s Schedule (cf. par.
10) may be found in the official reports, and that of 1921 in Newsholme’s
“Vital Statistics”; and a useful comparison of the 1911 and 1921 schedules
is given in J.I.A., LII, 345.

Under the Government of Ireland Act, 1920, and the establishment of
the Irish Free State (Saorstat Eireann) by the Treaty of December 6,
1921, Ireland has been split into Eire (as it is now called, i.e., the Irish
Free State) and Northern Ireland. The statistical activities of the
Free State were centralized in 1923 in the Department of Industry and
Commerce; the Census Act (Northern Ireland) of 1925 established the
separate powers and duties of the Registrar-General for Northern Ireland.
The usual decennial census for Ireland as a whole was not taken in 1921
(see footnotes 4 and 10 of the table in par. 7); a census was therefore made
in both Northern Ireland and the Irish Free State as of April 18, 1926.

Censuses of Other Countries

15. Full details of the development and recent practices of the censuses
of other countries, which need not be described here, may be found in the
chapters on “Censuses of Modern Times” and “Census-taking in Australia”
in Vol. I, 1911 Census of Australia (G. H. Knibbs, Commonwealth
Statistician), in Koren’s “History of Statistics,” in C. H. Wickens’ paper
(J.I.A., XLIII, 30), and in the various references given in par. 6 and
the footnote at the commencement of par. 7 of this Study.



III


THE REGISTRATION OF BIRTHS, DEATHS,

AND MARRIAGES

16. As already stated, the complete statistical utilization of census
results depends largely upon their ultimate employment in conjunction
with properly collected annual returns of births, deaths, and marriages,
for only by that means can the extent and character of the progress of nations
and communities be measured. Statistics of migration, also, should
be included in the category of essential facts, because with complete
records in all these particulars it would be possible to determine not only
the numbers of the people as enumerated by the census but also their
birth, death, marriage, and migration rates, so that the population at any
time could be estimated with reasonable accuracy. At the present time,
however, records of migration are very incomplete in most countries (see
also par. 57).
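
The dependence of a population estimate on complete records of births, deaths, and migration can be shown by a minimal sketch of the usual balancing relation (census count, plus births less deaths, plus immigrants less emigrants). The function names and all figures below are hypothetical and serve only to illustrate the arithmetic.

```python
def estimate_population(census_count, births, deaths, immigrants, emigrants):
    """Carry a census count forward using registered events since the census."""
    natural_increase = births - deaths
    net_migration = immigrants - emigrants
    return census_count + natural_increase + net_migration


def rate_per_1000(events, population):
    """Crude rate: events in the period per 1,000 of the population."""
    return 1000.0 * events / population


# Hypothetical district: census count 100,000, then one year of records.
estimated = estimate_population(100_000, births=2_500, deaths=1_200,
                                immigrants=800, emigrants=300)
print(estimated)
print(rate_per_1000(2_500, 100_000))
```

If the migration terms are unknown, as the text observes is commonly the case, the estimate rests on natural increase alone and its error grows with the length of the intercensal period.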

History of Registration Systems

17. Records of baptisms, burials, and weddings have naturally been
kept by the officiating clergy from early times. In Europe (according to
Edge, op. cit., “Vital Registration in Europe,” J.R.S.S., XCI, 351) the
maintenance of such records appears to have been originated in Spain by
the Archbishop of Toledo in 1497; and Edge draws attention to the further
interesting fact that “when Pizarro . . . conquered Peru [in 1533] he
found . . . the admirable system of registration in use among the Peruvians,
which was so efficient in operation as scarcely to have its counterpart
in the history of any semi-civilized community.” In Britain the recording
of these events by the clergy was made compulsory in 1538 by order of
Thomas Cromwell, Vicar-General under Henry VIII. Edge has also commented
that “in 1563 the Council of Trent made the keeping of registers
of births and marriages a law of the Catholic Church, and no doubt this
order led to the introduction of such registers among the Catholic communities
of the various European States.” In the Canadian Province of
Quebec, moreover, the Catholic records have been maintained since 1621,
and thus appear to constitute “the longest unbroken series of records of
baptisms, marriages, and burials in the world” (see R. R. Kuczynski’s
“Birth Registration and Birth Statistics in Canada” and the review thereof
in T.A.S.A., XXXII, 280).

Independently of the ecclesiastical records, tabulations of deaths, called
“Weekly Bills of Mortality,” were compiled certainly in 1592-94 during
the first plague in London, and probably as early as 1532 (see J.I.A., III,
248, and Raymond Pearl’s “Medical Biometry and Statistics,” pp. 32-35),
the information thus recorded being “procured by persons called
‘searchers,’ and arranged, printed, and distributed for weekly, quarterly,
and yearly periods by the Company of Parish Clerks of London, at whose
hall they were supposed to remain” (J.I.A., III, 249). In Ireland, the
“Dublin Bills of Mortality” were compiled similarly between about 1658
and 1772. These old records are important historically because, amongst
other reasons, they provided the foundations for the “Observations upon
the Bills . . .” published by Graunt in 1661 and Petty in 1683 which
foreshadowed the modern mortality table.*

The tabulation of such records by governmental agencies, and their
extension to include all births, deaths, and marriages as distinguished
from baptisms, burials, and weddings, however, has been attended by
considerable difficulty, and even opposition (see, for example, J.I.A.,
XXV, 81). In its pamphlet of March, 1941, entitled the “Model Vital
Statistics Act” (for which see par. 22 herein), the U.S. Bureau of the
Census has noted, with references to other authorities also, that “as early
as 1639 the judicial courts of the Massachusetts Bay Company issued
orders and decrees for the reporting of births, deaths, and marriages, not
as incidents of canon law, but as matters of ‘evidence whereupon the
verdict and judgment did passe.’ Massachusetts was thus the first political
unit on this continent to create by judicial order an administrative-legal
technique for the protection of rights by preserving evidence thereof. The
State imposed upon informed citizens the duty of recording with the government
all births, deaths, and marriages occurring in the community,
and conferred upon the recorded occurrence of these social facts the character
of competent evidence.” Kuczynski has accordingly remarked

* See also par. 82 here; the League of Nations Statistical Handbook No. 11 on Ireland,
by Greenwood and Edge; J.I.A., III, 248; V, 198; and XIX, 174.

In 1661 Captain John Graunt published the “Natural and Political Observations upon
the Bills of Mortality” of London, which in the field of medical statistics is widely
regarded as a classic. In 1683 and 1686 Sir William Petty compiled his “Observations” and
“Further Observations” upon the Dublin Bills. Much controversy has surrounded the
authorship of Graunt’s work, Lord Lansdowne, in his “Petty Papers,” holding that
Petty was the author. Greenwood, however, in his contribution on “Graunt and Petty;
A Re-statement” in J.R.S.S., XCVI, 76, and his recent (1948) “Medical Statistics
from Graunt to Farr” (for a review of which see J.I.A., LXXV, 146), strongly refutes
this interpretation. A further examination by D. V. Glass (“Graunt’s Life Table,”
J.I.A., LXXVI, 60) of the arithmetical techniques apparently used respectively by
Graunt and Petty also “lends additional support to the view of Professor Greenwood,
that Graunt’s life table was really constructed by Graunt and not by Petty.”



The Registration of Births, Deaths, and Marriages 

(J.A.S.A., VII, 1) that Massachusetts “was the first State in the Christian
world which recorded births, deaths, and marriages by government
officers.” The records, however, were not maintained (see also “Vital
Statistics” by J. W. Trask, Supplement No. 12 to the U.S. Public Health
Reports, 1914, for details of the Massachusetts laws, and for a copy of the
order of Thomas Cromwell previously mentioned). The same authority
has also pointed out (in his “Birth Registration and Birth Statistics in
Canada,” op. cit.) that the Act of 1826 in Quebec, providing for the
preparation of returns of baptisms, marriages, and burials by the Clerks
of the Civil Courts in order to ascertain the annual increase of the
population of the Province, for presentation to the Governor and the
Legislature, “constitutes the first start of vital statistics in North America.”

These early efforts, however, being localized in Massachusetts and
Quebec, were not national in scope. It is therefore generally claimed that
credit for the longest continuous series of national vital statistics is to be
accorded to Sweden, for there data are available since 1748.

Because efficient central administration is obviously an essential
element in maintaining comprehensive records and statistics, an important
landmark was the establishment of the office of the Registrar-General of
England and Wales by Act of Parliament in 1836. This was followed in
North America by the passage of the first State registration law by
Massachusetts in 1842 (see “The Federal Registration Service of the United
States” by Dr. C. L. Wilbur, U.S. Census Bureau, 1916). Under the
provisions of these enactments, however, registration was voluntary. It
was not made compulsory in Britain until the Births and Deaths
Registration Act of 1874, and in America even later.*

Registration in the United States
18. The development of birth, death, and marriage registration in the
United States has been slow and often difficult, principally because such
registration is entirely within the control of the various States, with the
result that no central authority has been armed with the power to obtain
the enactment of efficient laws.† It has consequently been necessary,

* On account of the difficulty of enforcing continuous registration, attempts were
made in the United States at the censuses from 1850 to 1900 to obtain mortality statistics
by the insertion in the census schedule of questions relating to those who had died in
the year immediately preceding the census; and a similar practice was followed in
Canada. The results, however, were extremely unreliable, as the method depends so
largely upon the existence and memories of the relatives of decedents. This practice,
therefore, has now been abandoned in both countries.

† The Census Bureau's March, 1941, publication on the “Model Vital Statistics
Act” (cf. pars. 17 and 22 herein) records a number of Court decisions, together with the



24 Population Statistics and Their Compilation 

through many years of agitation, to convince the various authorities of
the desirability of inter-state co-operation as well as of efficient
registration within each State. This educational campaign was originally
conducted mainly by the American Public Health Association (which
includes the majority of the foremost sanitary and registration authorities
of the United States and Canada), the American Medical Association, the
American Bar Association, and the Life Insurance Association of America
(which, under its former name as the Association of Life Insurance
Presidents, issued numerous pamphlets dealing with the question); the
American Statistical Association and the Actuarial Society of America
have taken their share in urging and helping to guide improvements (cf.
T.A.S.A., XVIII, 271); and amongst Government organizations the Public
Health Service, the Children’s Bureau, and particularly, of course, the
Bureau of the Census have led the activities, with more recent assistance
also from the Social Security Administration.

19. A “Registration Area for Deaths” was established by the Census
Office in 1880. For this area, which comprises those states and cities in
which at least 90 per cent of the deaths are properly recorded, transcripts
of the death records are forwarded to the Census Bureau for tabulation
and subsequent analysis. In 1880 only Massachusetts, New Jersey, and
the District of Columbia were included, representing 17 per cent of the
population of the country; in 1890 and 1900 the other New England
States and Michigan were admitted at various times. These transcripts
until 1900 were only obtained decennially; since 1900, however, annual
compilations have been made. The registration area, moreover, has been
extended gradually until finally all the States were included by the
admission of Texas in 1933.*
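The admission rule for the registration area can be illustrated with a short Python sketch. It assumes that completeness is measured as registered deaths divided by an independent estimate of the deaths that actually occurred; the text states only the 90 per cent threshold, so the class, the function names, and the three sets of figures are hypothetical and serve only to show the arithmetic.

```python
from dataclasses import dataclass

# "At least 90 per cent" is the only figure taken from the text.
ADMISSION_THRESHOLD = 0.90


@dataclass
class StateReturns:
    name: str
    registered_deaths: int  # deaths recorded by the registrars
    estimated_deaths: int   # independent estimate of deaths that occurred (assumed input)


def completeness(returns: StateReturns) -> float:
    """Proportion of the estimated deaths that were actually registered."""
    return returns.registered_deaths / returns.estimated_deaths


def registration_area(states):
    """Names of the states whose completeness meets the admission threshold."""
    return [s.name for s in states if completeness(s) >= ADMISSION_THRESHOLD]


# Purely illustrative figures, not taken from any census report.
example = [
    StateReturns("State A", 9_400, 10_000),
    StateReturns("State B", 8_200, 10_000),
    StateReturns("State C", 9_000, 10_000),
]
admitted = registration_area(example)
print(admitted)
```

In practice the difficult step is the independent estimate of the deaths that occurred, which this sketch simply takes as given.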

20. A similar “Registration Area for Births,” comprising states and
cities in which registration of births is at least 90 per cent complete, was
not established until 1915, and then comprised only 10 States and the

following statement: “Vital statistics have been considered almost exclusively to be a
State responsibility. They are regarded as a necessary part of the public health functions
of the State and, in legal theory, as an outgrowth of the police power of the State. This
has been well established and judicially accepted, though only two State constitutions
[those of Texas and Washington] refer expressly to the subject. The Federal Government
has no express constitutional power to enact vital statistics legislation of a
national scope.”

* In the “Physicians' Handbook on Birth and Death Registration” (Bureau of the
Census, 193*, p. 26), a complete table is given which shows, for each State and the
District of Columbia, and separately for deaths and births, the years in which the first
registration law was enacted, the records on file became complete, and admission to the
registration area occurred.




District of Columbia. It expanded rapidly, however, and (like the area for
deaths) was completed by the admission of Texas in 1933.

21. The compilation of marriage and divorce statistics was approached
for many years by the utilization of certain State reports and through
information obtained by the U.S. Census Bureau from the records kept
at the State Capitols or the county courts. The data on which reports
were published covered only the 20-year periods 1867-86 and 1887-1906,
then the single year 1916, and finally each of the years from 1922 to 1932
(when the annual reports were discontinued because of the economy
program of 1933), estimates for the missing years from 1907 to 1915 and
from 1917 to 1921 being given in the 1926 report. The information thus
made available was seriously inadequate, moreover, by reason of varying
degrees of incomplete coverage, and the absence of important tabulations
such as those by age, color or race, residence, and occupation. In 1940
plans were made by the Bureau of the Census for the separate creation of
partial marriage and divorce registration areas (covering 27 of the States
for marriages, and 14 for divorces), which would operate through
transcripts of adequate data from the original records in the same manner by
which the Bureau has gradually organized the collation of the birth and
death statistics. These plans, which in 1946 became the responsibility of
the National Office of Vital Statistics (see par. 22 here), have not yet
materialized fully; at present the only data published cover the total
number of occurrences on the basis of voluntary information supplied by
county clerks and other officials. A recent (1953) discussion of the
problems involved is given in H. Carter’s paper “Improving National
Marriage and Divorce Statistics,” J.A.S.A., XLVIII, 453.

22. Because the attainment of nation-wide registration has thus been
a problem of inter-state co-operation and the standardization of divergent
regulations, the American Public Health Association in 1895 suggested a
model State bill, which was subsequently developed with the assistance
of various non-governmental organizations until in 1907 it was sponsored
by the Bureau of the Census for submission to the States as the Model
Vital Statistics Law (see also T.A.S.A., XVIII, 271-75, and the pamphlets
of Dr. Trask and Dr. Wilbur noted in par. 17 for a copy of this Model
Law). Its promulgation exercised great influence, and its principles were
eventually adopted in every State. Since, however, it dealt only with the
registration of births and deaths, and because its legal theory and specific
provisions had become inadequate despite several revisions, the Bureau
of the Census in 1938, after exhaustive study and co-operation with
numerous agencies, initiated the recommendation to the States of a
Uniform Vital Statistics Act covering marriages and divorces as well as births




and deaths, recognizing stillbirths as a separate statistical entity (in place
of the previous practice of recording a stillbirth by the simultaneous
completion of a birth and a death certificate), and providing for delayed and
amended registrations. It embodied the legal principle of preservation of
evidence in its descriptive title as an “Act to secure complete data
pertaining to births, deaths, stillbirths, marriages, divorces, and annulments
of marriage, to authorize and regulate the use of vital statistics records as
evidence, [and] to authorize the (State Board of Health) to make
regulations for the enforcement of this Act.” For the specific purposes of the
law, and “in order to delimit approximately the scope within which vital
statistics shall operate as a governmental function” (as distinct from the
wider interpretation given in par. 1 of this Study), this Act also, for the
first time, has “attempted a definition of vital statistics” as comprising
“the registration, preparation, transcription, collection, compilation, and
preservation of data pertaining to the dynamics of the population, in
particular, data pertaining to births, deaths, marital status, and the data
and facts incidental thereto” (see the Census Bureau’s March, 1941,
publication entitled “Model Vital Statistics Act”).

On July 16, 1946, the Division of Vital Statistics of the Bureau of the
Census, Department of Commerce, was transferred to the Federal
Security Agency, and the Federal Security Administrator assigned the vital
statistics functions to the Office of the Surgeon General in the U.S. Public
Health Service. The former Vital Statistics Division of the Bureau of the
Census, thus transferred, is now known as the National Office of Vital
Statistics (in the Bureau of State Services of the Public Health Service).

The standard live-birth and death certificates which constitute the
foundation of the system are shown on the pages following. The standard
certificate of stillbirth, which need not be shown here, is similar to that
for live-births with the addition of questions concerning history and cause.

Registration in Canada

23. As in the United States, the main difficulty for many years in
securing satisfactory registration in Canada was the fact that vital statistics
are, by the British North America Act, under Provincial jurisdiction, as
pertaining to “civil rights.” The activities of the Dominion Government
have therefore been restricted to those of a co-ordinating agency, by virtue
of the powers conferred by the Census and Statistics Act of 1879 which
provided that the Minister of Agriculture should “collect, abstract, and
tabulate . . . vital statistics,” and might also arrange for “the
transmission of such information as is required, by schedules prepared by the
Census Office,” from any Province or territory where “any system is



[Certificate of Live Birth (1949 Revision of Standard Certificate), Federal
Security Agency, Public Health Service. Form not reproduced. Its items cover
the place of birth, the usual residence of the mother, the child's name and
date of birth, the father and the mother of the child, children previously
born to the mother, the attendant's certification that the child was born
alive, the registrar's signature, and a section for medical and health use
only (length of pregnancy, weight at birth, legitimacy), with space for the
addition of medical and health items by individual States.]


established or any plan exists for collecting . . . vital statistics.”
Accordingly, in 1882 an Order-in-Council was passed empowering the Minister to
collect “mortuary statistics” from certain cities and towns of 25,000
persons or over; and these were published annually until 1891, when they
were abandoned on the gradual organization of the Provincial systems.
Census enumerations of deaths were also made at each decennial census
to 1911 inclusive.

In 1918, however, under authority of the Statistics Act, 1918, the
Dominion Bureau of Statistics called a Conference on Vital Statistics (at
which the Actuarial Society of America was represented, on invitation, by
a committee); and it was then decided to cease the attempt to collect
mortality statistics on the census schedule, and a scheme of Dominion and
Provincial co-operation was approved. This plan provided for the
admission of any Province which should enact legislation in conformity with a
“Model Act” and which should “furnish satisfactory evidence that it
received returns of at least 90 per cent of all marriages, births, and deaths,”
the Dominion Bureau undertaking the tabulation of transcriptions
(microfilms now being used) of the Provincial registrations (see Report of
Conference on Vital Statistics, 1918; First Annual Report of the
Dominion Statistician, 1919; and the first three [1921 to 1923] of the Annual
Reports on Vital Statistics, all prepared by the Dominion Bureau of
Statistics). A carefully detailed historical account of the situations both
before and after the establishment of the registration plan which followed
the 1918 Conference on Vital Statistics is available in Kuczynski’s “Birth
Registration and Birth Statistics in Canada.” The organization, which
also resembles that of Australia, was thus placed upon a basis very similar
to that of the United States, except that the registration areas for births
and deaths have not been different and marriages are included in the
plan. As a result of this conference the whole country except Quebec and
the Yukon and the North-West Territories, embracing 73 per cent of the
total population, was admitted to the scheme in 1921. The standard
certificates approved by the Dominion Bureau were adopted subsequently
by Quebec in 1924, the North-West Territories in 1926, and the Yukon
in 1929. Quebec finally entered the registration area as from January 1,
1926. In the annual reports of the Bureau the figures for the Yukon and
North-West Territories are stated separately because they are not
regarded as complete and are so small that they are not of material
significance.

Registration in the United Kingdom and Eire
24. Centralized control of registration was established in England and
Wales in 1837, through the creation of the General Register Office (under




the direction of the Registrar-General) by an Act in 1836. The legislation
was amended some years later by the Births and Deaths Registration Act
of 1874, which, in particular, placed the responsibility for registration of
births upon the parents instead of on the local registrars (as in the Act of
1836). Through its compulsory features and strict enforcement, the 1874
Act secured very complete returns, and failure to register is now
practically negligible (cf. par. 45). Effective civil registration commenced in
Scotland in 1855, and in Ireland in 1864.

The data for England and Wales have been examined in a long series of
“Annual Reports of the Registrar-General of Births, Deaths, and
Marriages,” which commenced in 1838 and in each year thereafter gave, in the
form of “blue books,” a thorough analysis of the statistics until the 83rd
report of 1920. In 1921 a change in style was made, the reports until 1937
inclusive being published in three parts (Tables (Medical), Tables
(Civil), and Text) under the title “The Registrar-General’s Statistical
Review.” Since 1937, the Review has appeared irregularly in consequence
of the disturbances due to the War. In commemoration of the centenary
(1837-1937) of the General Register Office and the registration service in
England and Wales, “The Story of the General Register Office and Its
Origins from 1538 to 1937” was compiled by the Registrar-General and
published in 1937.

The reports for Scotland and Ireland have always been prepared
separately by their respective Registrars-General. Those for Scotland
appeared regularly until 1938, were interrupted by the War, and
subsequently have been brought up to date. The Irish reports have been issued
for Eire (Saorstat Eireann) and Northern Ireland independently since 1922.

Registration in Other Countries

25. Particulars of the registration systems of other countries, which
need not be detailed here, may be found in Koren’s “History of Statistics”
previously mentioned, partially also in J.I.A., XLIII, 39, in P. G. Edge’s
paper on “Vital Registration in Europe” (J.R.S.S., XCI, 346), and in the
Statistical Handbooks series of the League of Nations, the U.S. Census
Bureau’s “General Censuses and Vital Statistics in the Americas,” and
the Census Library Project’s “National Censuses and Vital Statistics in
Europe, 1918-1939” and the “1940-1948 Supplement” (cf. footnote to
par. 7).

The Fundamental Requirements for Satisfactory Birth
and Death Registration

26. In addition to a central registration office in full control, and the
employment of efficient local registrars (which are the foundations upon




which the systems already described are organized), the further essential
provisions for satisfactory birth and death registration are (1) immediate
registration, and (2) the use of standard forms, upon which the entries
must be made by qualified practitioners wherever possible, and again
checked by the registration officials in order to secure uniformity of
classification and nomenclature; while (3) the law must be rigidly enforced, and
its observance must be checked by requiring a burial or removal permit in
case of death.

These conditions, however, are not easily attained. The securing of
immediate registration is largely dependent upon strict regulations, which
must be accompanied by adequate penalties in case of non-compliance.
The requirement of a burial or removal permit prior to interment gives
an automatic check on death registration; but it is difficult to devise any
similar check upon the observance of the birth registration laws, and this
has contributed largely to the less effective registration of births which
exists in many countries. Interested readers will find basic discussions of
these regulations in the English “Report of the Select Committee of the
House of Commons on Death Certification” in 1893, and for the United
States in Dr. Wilbur’s pamphlet, noted in par. 17.



IV 


THE RELIABILITY OF CENSUS AND REGISTRATION
STATISTICS, AND THE NATURE OF THE
ERRORS THEREIN

The preceding paragraphs have dealt with the general question of the
methods by which census and registration data may be secured, the
organization which is necessary for their collection, and the diversities of
classification which may arise through the difficulty of securing uniform
entries by the officials charged with their collection. We may now
consider the nature of the errors in the resulting statistics, and the manner in
which they may be detected and minimized.*

Errors in Census Statistics

27. The principal sources of error in census statistics are (a) accidental
or wilful misstatements by the individuals enumerated, (b) carelessness or
lack of training on the part of the enumerators, (c) the difficulty of
uniform classification, and (d) the possibility of the entries on the schedules
being wrongly inserted, or of the census clerks misreading those entries,
and other errors of tabulation. The questions on the schedule, and their
columnar arrangement, must therefore be framed with a view to
minimizing the danger of such inaccuracies.

The accidental misstatements in (a) are attributable either to
ignorance on the part of the individuals concerned, or to the fact that
frequently the information has to be obtained from other parties, such as
some other member of the family, or a boarding-house or hotel keeper.
Wilful misstatements are generally due to vanity, to a desire to appear as
conforming to certain labor and immigration laws, and social conventions,
or to resentment against the personal nature of some of the enquiries and
suspicions of their motives. The fact that this last class of error may
sometimes be of considerable importance still is illustrated by the
abandonment (in conformity with the recommendations of the British Empire
Statistical Conference, 1920, and of the Royal Statistical Society) of the
questions relating to infirmities in the 1921 census of England and Wales
on account largely of the unwillingness of the parties concerned to give
the desired information. (See also Census of England and Wales, 1911,
General Report, p. 232.)

* The methods of correcting or adjusting such errors in the statistics are discussed
subsequently, in pars. 51-54.



28. The errors and uncertainties which arise from these various
sources may affect most of the items of information on the census schedule
to some extent, as may be seen from a perusal of the text reports on the
censuses of such countries as England and Wales, and the United States.
The most important of such difficulties with which actuarial students are
concerned, however, are those relating to age; they are evidenced,
generally, in four ways, viz.: (1) A deficiency in the number of infants
reported at ages 0 and 1 last birthday; (2) a tendency for disproportionately
large numbers to be enumerated at ages ending in 0 and 5, and at other
ages ending in even rather than odd digits; (3) a natural inclination to
overstate the age until the attainment of majority, and then to understate
at adult ages, with some overstatement in advanced years; and (4) the
return of the ages of some persons as “unknown.” These types of
inaccuracy will now be considered in detail. In so doing it is to be remembered
that such errors may be masked or accentuated by past changes in the
rates of birth, death, and migration, and due consideration must
therefore be given to the possible influence of such changes in any particular
case.

(1) The Deficiency in the Number of Infants

29. In those countries, such as England and Wales, where birth and
death registrations are reliable and practically complete, the correctness
of the populations enumerated by a census at the youngest ages may be
checked by recalculation from the birth and death statistics, and the
influence of migration at these early ages may generally be ignored. The
following table, for example (taken, with additions, from Vol. VII, p. xliv,
Census of England and Wales, 1911, Report on the Graduation of Ages,
by George King) shows the results of such a comparison, the estimated
populations being obtained by deducting the appropriate deaths from the
births of previous calendar years, on the principle of formula (6), par. 64,
of this Study:



MALE POPULATION, AS AT 2ND APRIL, 1911

                                          Deficit by Census
  Age     Estimated    Enumerated      Actual         Percentage
  0-1      421,135      395,110        26,025            6.18
  1-2      400,988      374,109        26,879            6.70
  2-3      396,190      395,619           571             .14
  3-4      390,682      388,669         2,013             .52
  4-5      380,585      382,306    excess 1,721     excess .45





It will be seen from this table that there is a considerable deficit in the
numbers enumerated at ages 0 and 1 last birthday, and that at ages 2 and
over the populations are substantially correct. In view of this latter
circumstance the deficit at ages 0 and 1 may also be shown by assuming the
populations aged 2 and over as practically correct and adding thereto the
appropriate deaths, in accordance with formula (5) of this Study.
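The comparison in the 1911 table can be retraced with a few lines of Python. This is only a sketch of the final step: it takes the estimated populations as given rather than rebuilding them from births and deaths, and the figures are a reading of the printed table. The percentages in the table are reproduced only when the deficit is expressed on the estimated (not the enumerated) population, which is the base used here. The names `MALES_1911` and `census_deficit` are illustrative.

```python
# Male population of England and Wales at 2nd April 1911, as read from the
# table above: age last birthday -> (estimated from births and deaths, enumerated).
MALES_1911 = {
    0: (421_135, 395_110),
    1: (400_988, 374_109),
    2: (396_190, 395_619),
    3: (390_682, 388_669),
    4: (380_585, 382_306),
}


def census_deficit(estimated: int, enumerated: int):
    """Return (actual deficit, deficit as a percentage of the estimate).

    A negative result means the census enumerated more persons than estimated.
    """
    deficit = estimated - enumerated
    return deficit, 100.0 * deficit / estimated


deficits = {age: census_deficit(est, enum) for age, (est, enum) in MALES_1911.items()}

for age, (actual, pct) in deficits.items():
    label = "deficit" if actual >= 0 else "excess"
    print(f"age {age}: {label} {abs(actual):>6,} ({abs(pct):.2f}%)")
```

The large deficits at ages 0 and 1, against differences of about half of one per cent or less at ages 2 to 4, are what lead to the conclusion drawn in the text.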

These deficiencies at ages 0 and 1 were found again at the 1921 census
of England and Wales, although then they had decreased to 2.93% and
(cf. T.A.S.A., XXIX, 330, and XLII, 79). The 1931 census, moreover,
revealed even greater improvement, as may be seen from the following
results of applying the method used by King for the 1911 table already
shown (see Lewis-Faning, J.R.S.S., C, 68, and Wolfenden, T.A.S.A.,
XLII, 79):


POPULATIONS (IN THOUSANDS) AS AT THE MIDDLE OF 1931

                                          Deficit by Census
  Age     Estimated    Enumerated      Actual      Percentage
  0-1        605          599             6            .99
  1-2        599          589            10           1.67
  2-3        596          595             5            .50
  3-4        601          595             6           1.00
  4-5        611          611             0           0.00


In Northern Ireland similar deficiencies of 3.43%, 3.88%, and 2.26%
at ages 0, 1, and 2 were shown in the census of 1926 (see p. xxxvii of the
General Report thereon, and Wolfenden, loc. cit.).

30. These deficiencies at ages 0 and 1 appear also in the United States
and Canada, where they have frequently been pointed out (see A. A.
Young’s report on Ages at the 1900 U.S. Census, in Supplementary
Analysis, 12th Census, p. 140; Prof. Glover’s U.S. Life Tables 1890, etc.,
p. 342; U.S. Abridged Life Tables 1919-20, p. 9; R. Henderson, T.A.S.A.,
XXIII, 435; H. H. Wolfenden, T.A.S.A., XXIV, 132; and R. J. Myers,
T.A.S.A., XLI, 396). On account of incomplete birth and death
registration the method of exhibiting the deficiencies which has generally been
employed in the U.S. reports is to express the number enumerated at each
age under (say) 5 as a percentage of the total number enumerated under
5, on the theory that under normal conditions these percentages should
decrease steadily. In the case of the 1911 English data of par. 29, for





example, they are 20.4, 19.3, 20.5, 20.1, and 19.7, indicating clearly a
deficiency at age 1 (and suggesting also, though not so clearly as King’s
method, that the number enumerated in the first year is too small in
relation to those at ages 2 and over). For U.S. data (see Myers, T.A.S.A.,
XLI, 396) the comparable percentages are 20.8, 18.6, 20.4, 20.3, and 19.9
at the 1910 census, and 19.1, 18.9, 20.3, 21.0, and 20.7 in 1930.
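The percentage method described above can be sketched in Python as follows. The two functions are illustrative, not part of any census procedure: the first expresses each single year of age under 5 as a percentage of the total under 5, and the second applies one simple reading of "should decrease steadily" by listing the ages whose share is smaller than that of the next older age. The U.S. percentages are those quoted in the text; the `demo` counts are hypothetical.

```python
def shares_under_five(counts):
    """Each age's count as a percentage of the total under 5, to one decimal."""
    total = sum(counts)
    return [round(100.0 * c / total, 1) for c in counts]


def ages_breaking_decline(shares):
    """Ages whose share is below the share of the next older age.

    Under the theory quoted in the text the shares should fall steadily with
    age, so every age listed here departs from the expected pattern.
    """
    return [age for age in range(len(shares) - 1) if shares[age] < shares[age + 1]]


# Percentages at ages 0 to 4 quoted in the text (Myers, T.A.S.A., XLI, 396).
US_1910 = [20.8, 18.6, 20.4, 20.3, 19.9]
US_1930 = [19.1, 18.9, 20.3, 21.0, 20.7]

flags_1910 = ages_breaking_decline(US_1910)
flags_1930 = ages_breaking_decline(US_1930)
print(flags_1910, flags_1930)

# Hypothetical enumerated counts, only to show the first function at work.
demo = shares_under_five([200, 180, 210, 205, 205])
print(demo)
```

A rule of this kind flags a departure only by comparison with neighbouring ages, which is why, as the text observes, it shows a shortage in the first year of life less clearly than King's method does.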

31. Two explanations of these anomalies at ages 0 to 1 have been put
forward, namely, (a) that there is a tendency to omit young children
altogether from the census, and (b) that there are considerable
misstatements of age in the case of those children who are actually enumerated.

With regard to (a) it is suggested by George King (Supplement to the
75th Annual Report of the Registrar-General of England and Wales,
Part 1, p. 13; see also the similar remarks in the General Report on the
1921 Census of England and Wales, and Lewis-Faning, J.R.S.S., C, 68)
that where, as in the data of par. 29, the populations enumerated at ages
2 and over practically agree with the numbers estimated from the birth
and death returns, “the conclusion seems to be inevitable that a large
number of infants under two years of age escape enumeration”; and this
is undoubtedly so except to the extent that persistent overstatement of
age (carried forward into the higher ages also) may explain a portion of
the deficit.

King’s conclusion is also supported by an investigation of the children
born in Washington, D.C., in 1919 which was made by Miss Foudray in
preparing the U.S. Abridged Life Tables, 1919-20. As there stated, “the
census returns for the District and its death records were searched for the
children born there in 1919, and a form letter was sent to the parents of
those children whose names did not appear either in the census schedules
of Jan. 1, 1920, or on the death records for the District for 1919. Between
500 and 600 answers to these enquiries were received, and they were used
as a basis for estimating the status on Jan. 1, 1920, of the children whose
names were missing from the schedules and about whom it was impossible
to obtain definite information. Separate records were kept for white and
Negro children, and the per cent of children whose names were missing
from the census schedules, but who were actually living in the District on
Jan. 1, 1920, was found to be much greater among Negroes than among
whites. The constant per cent of infants whose names were missing was
taken as 9 for whites and 25 for Negroes.” The original statistics upon
which these percentages were based were not given, however; and
although they were assumed to be equally applicable in all other sections of
the U.S. it is to be noted that they were derived only from the local data
of Washington, D.C.




For the whole of the United States some calculations by Myers
(T.A.S.A., XLI, 396-97), based on projections of corrected and estimated
births of previous years (which had to be used instead of actual births, on
account of the incompleteness of birth reporting in the U.S.), also
indicated substantial deficiencies for whites and greater deficiencies for the
colored population at both the 1920 and 1930 censuses.

These omissions may be due largely (as remarked in the 1921 census
report of England and Wales) to “the possible and intelligible
non-enumeration of . . . infants of whose births the registration details
necessarily would not have been completed at any census date, and the natural
reluctance of parents to state the true ages of children born within a short
time of their marriage.”

32. With regard to (b), misstatements of age in the case of the children 
actually enumerated, it has been customary in the English census to 
require the ages of children under one year old to be stated in completed 
months, and at the 1921 census this method was extended to the higher 
ages, which were asked for throughout in years and months. In the United 
States years and months (expressed as twelfths of a year) were required 
for children under 2 in the census of 1910, and in 1920 for children under 
5. It was therefore thought at one time that this mode of age statement 
in months would be likely to eliminate most errors, especially when 
accompanied by the explicit instructions which are generally given to 
enumerators to be particularly careful in obtaining such ages (cf. 1911 
Census of England and Wales, General Report, p. 86, and Vol. VII, pp. xxx 
and xliv). Two valuable investigations which have been made by Dr. J. C. 
Dunlop (then Registrar-General for Scotland), however, definitely show 
the existence of a marked tendency to overstate the ages of young 
children. Dr. Dunlop took a sample population, and the children enumerated 
therein were identified in the birth registers and their reported ages 
compared with their true ages. In the first study (J.R.S.S., LXXIX, 309) 
the sample comprised the children of Paisley and Haddington, Scotland, who 
were under 5 at the 1911 census; in the second (J.R.S.S., LXXXVI, 547) 
those under 6 in Paisley and East Lothian at the 1921 census were used. 
In the former case the census asked for the age in months only for children 
under 1, subsequent ages being given in completed years; the 1921 
census required the age in years and months throughout. In both studies 
slightly over 83% of the enumerated children were successfully identified 
in the birth registers, the untraceable balance being attributed largely to 
migration (see also J.R.S.S., LXXXVI, 568). The detailed comparisons 
of the true and reported ages of the children actually traced were shown 
thus: 



               AGE LAST BIRTHDAY AS STATED IN CENSUS RETURNS

TRUE AGE LAST
BIRTHDAY IN
YEARS                0       1       2       3       4       5     Total

                              Census of 1911

0 ............   2,626     142       7       3       2     (*)     2,780
1 ............      13   2,304     229       2      ..     (*)     2,548
2 ............       2      13   2,176     231       5     (*)     2,427
3 ............       4       8      25   2,051     168     (*)     2,256
4 ............       1       6       7      30   1,926     (*)     1,970

Total (0-4) ..   2,646   2,473   2,444   2,317   2,101     (*)    11,981

                              Census of 1921

0 ............   2,626      76       1       2       1      ..     2,706
1 ............      19   2,717      64       6       1      ..     2,807
2 ............       1       9   1,711      86       5       2     1,814
3 ............       1      ..       9   1,702      59       1     1,772
4 ............       1      ..       1      15   1,826      56     1,899
5 ............      ..       1       1       1      11   1,753     1,767

Total (0-5) ..   2,648   2,803   1,787   1,812   1,903   1,812    12,765

(*) Not included in 1911 study.


The fact that these statistics demonstrate a persistent tendency to 
overstate the age may be seen clearly from the table on page 38 epitomizing 
the results.* It is to be noted, of course, that the enquiry analyzes only 
the correctness of the ages of those children who were enumerated, and so 
does not touch upon the other question of actual omissions from the 
schedules.† 

* The figures in the last two columns of that table differ slightly from 
those given originally by Dr. Dunlop in J.R.S.S., LXXXVI, in order to 
correct the errors therein pointed out by H. H. Wolfenden in T.A.S.A., 
XLII, 81. 

† See also H. H. Wolfenden and J. Thompson, T.A.S.A., XXIV; and compare 
Sir A. W. Watson's and Mr. Stevenson's remarks in J.R.S.S., LXXXIII, 
437 and 443 on the similar features which are shown by the statements as 
to duration of marriages at the census, and Dr. Dunlop's analogous 
investigation of marriage durations in J.R.S.S., LXXVIII, 35. A further 
discussion of Dunlop's data is given by D. V. Glass in Population Studies, 
V, 77. 

While Dr. Dunlop's investigations show that the ages of children under 
1 are not correctly obtained merely by the process of asking for the age 
in years and months, that form of question does appear to have secured 
in 1921 more accurate results at ages over 1 than the previous method 
of requiring such ages only in completed years. It therefore seems that the 
form of the age query has an appreciable influence on the accuracy of the 
stated ages (see also the last sentence of par. 34(d) here). This suggestion 
is also made in the 1911 English census reports (General Report, p. 86, and 
Vol. VII, p. xxxi), and is further borne out by A. A. Young's previously 
mentioned report on ages at the 12th U.S. census. Dr. Young there applied 
the percentage test to the data of a number of countries where the 
ages are obtained in years or years and months, and also to other 
countries where they are derived from a direct statement of the date of 
birth; 


                       1921                                 1911
         ---------------------------------    ---------------------------------
                          Errors                               Errors
                    Over-         Under-                 Over-         Under-
         Number   statement     statement     Number   statement     statement
         Re-      Actual  Per   Actual  Per   Re-      Actual  Per   Actual  Per
Age      ported           Cent          Cent  ported           Cent          Cent

0 .....   2,706     80   2.96     ..    ..     2,780    154   5.54     ..    ..
1 .....   2,807     71   2.53     19   0.68    2,548    231   9.07     13   0.51
2 .....   1,812     91   5.02     10   0.55    2,427    236   9.72     15   0.62
3 .....   1,771     59   3.33     10   0.56    2,256    168   7.45     37   1.64
4 .....   1,843     ..    ..      17   0.92    1,970     ..    ..      44   2.23

0-4 ...  10,939    301   2.75     56   0.51   11,981    789   6.59    109   0.91


and it was found that the deficiency at age 1 was clearly shown in the 
former case, but that when the date of birth was used the deficiency in a 
number of cases did not appear and the percentages showed a normal 
decrease throughout. Similar comparisons are also given by W. B. Bailey 
in Vol. I, p. 294, of the 13th U.S. Census Reports, together with the 
percentage of illiteracy in each country; but another suggestion is there 
made that “probably illiteracy is responsible for most of the anomalies 
shown and that the form of age enquiry does not materially affect the 
results.” 

Analysis of a small sample from the General Report on the New Zealand 
census of 1921, in which the comparisons of the stated and true ages 
were carried out beyond as well as below age 5, also showed tendencies 
closely comparable with those of Dunlop's 1921 material. For males and 
females separately the tabulations at ages 0-5 are shown in the table 
on page 39 (as stated, after correction, by H. H. Wolfenden in T.A.S.A., 
XLII, 81-82): 





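TRUE AGE              AGES LAST BIRTHDAY AS STATED IN
LAST                        THE CENSUS RETURNS
BIRTHDAY           0      1      2      3      4      5    Total

                              Males

0 ...........     63     ..     ..     ..     ..     ..       63
1 ...........     ..     62      4      1     ..     ..       67
2 ...........     ..     ..     58      3     ..     ..       61
3 ...........     ..     ..     ..     57      1     ..       58
4 ...........     ..     ..     ..     ..     63      4       67
5 ...........     ..     ..     ..     ..     ..     56       56

Total 0-5 ...     63     62     62     61     64     60      372

                             Females

0 ...........     61     ..     ..     ..     ..     ..       61
1 ...........     ..     64      3     ..     ..     ..       67
2 ...........     ..     ..     59      1     ..     ..       60
3 ...........     ..     ..      1     60      2     ..       63
4 ...........     ..     ..     ..     ..     60      4       64
5 ...........     ..     ..     ..     ..     ..     56       56

Total 0-5 ...     61     64     63     61     62     60      371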


Comparison with Dunlop's 1911 and 1921 data already given at ages 0-4 
then shows the following results (as given by Wolfenden, loc. cit.), in 
which the similarity of the 1921 percentages for both sexes together is 
notable: 

                                                 Percentage   Percentage
Data                                             Overstated   Understated

(1) Dunlop (Scotland), 1911 ...................     6.59          .91
(2) Dunlop (Scotland), 1921 ...................     2.75          .51
(3) New Zealand, 1921, Males and Females ......     2.41          .16
(4) New Zealand, 1921, Males ..................     2.88          .00
(5) New Zealand, 1921, Females ................     1.93          .32
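
The percentages in the comparison above can be recomputed from a cross-tabulation of true against stated ages. The following Python sketch is illustrative only: the two matrices are the New Zealand tabulations as read from the text, the function name is hypothetical, and it assumes that the percentages at ages 0-4 are taken on the children whose true and stated ages both fall within 0-4 (an assumption which happens to reproduce the printed New Zealand figures).

```python
# Rows are true ages 0-5, columns are stated ages 0-5 (New Zealand, 1921).
MALES = [
    [63, 0, 0, 0, 0, 0],
    [0, 62, 4, 1, 0, 0],
    [0, 0, 58, 3, 0, 0],
    [0, 0, 0, 57, 1, 0],
    [0, 0, 0, 0, 63, 4],
    [0, 0, 0, 0, 0, 56],
]
FEMALES = [
    [61, 0, 0, 0, 0, 0],
    [0, 64, 3, 0, 0, 0],
    [0, 0, 59, 1, 0, 0],
    [0, 0, 1, 60, 2, 0],
    [0, 0, 0, 0, 60, 4],
    [0, 0, 0, 0, 0, 56],
]


def misstatement_percentages(table, top_age=4):
    """Return (per cent overstated, per cent understated) at ages 0..top_age.

    Assumption: only children whose true AND stated ages are both within
    0..top_age are counted in the base.
    """
    ages = range(top_age + 1)
    base = sum(table[t][s] for t in ages for s in ages)
    over = sum(table[t][s] for t in ages for s in ages if s > t)
    under = sum(table[t][s] for t in ages for s in ages if s < t)
    return 100.0 * over / base, 100.0 * under / base


def add_tables(a, b):
    """Cell-by-cell sum of two cross-tabulations of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


for label, table in (("Males", MALES), ("Females", FEMALES),
                     ("Both sexes", add_tables(MALES, FEMALES))):
    over, under = misstatement_percentages(table)
    print("%-10s overstated %.2f%%, understated %.2f%%" % (label, over, under))
```

Under that assumption the sketch gives 2.88 and .00 for males, 1.93 and .32 for females, and 2.41 and .16 for both sexes together.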


(J) “Heaping” at Ages Ending in 0 and 5, and at Other Ages 
Ending in Even Digits 

33. The actual age constitution of a population on any particular date 
may be depicted conveniently by means of a diagram, in which the numbers 
at each age or age-group may be represented either by points joined 
by straight lines, or more clearly by rectangles as in the U.S. reports. The 
contour of such a diagram would be perfectly smooth in the case of a 
theoretical life-table population, in which there are no migrations and 
the births and deaths are absolutely uniform; but in the actual populations 
enumerated by a census the outline of the diagram will be irregular, 
on account of migrations, varying birth and death rates, and misstatements 
of age. The irregularities produced by migration will, of course, 
only be traceable to that cause when the migration has occurred in 
sufficient quantity to affect markedly the outline at several consecutive 
ages; changes in the birth and death rates will generally be difficult to 
analyze in their effect on the general shape of the diagram; but 
misstatements of age will be shown clearly by projections and depressions 
at individual ages. 

DIAGRAM I 

Age and Sex Distribution of Native Whites of Native 
Parentage in Age Groups; U.S. 1910 

[Diagram: per cent of males and of females in each age period.] 



The accompanying diagrams illustrate these principles. The first two 
(taken from Vol. I of the 1910 U.S. Census Reports)* show the age and 
sex constitutions in age groups only, the diagram for native whites of 
native parentage being a normal triangular shape, and that for 
foreign-born whites clearly showing the distortion of the triangle produced 
by immigration at adult ages. The third (from Young's report)† gives the 
distribution by single years for the whole U.S. population in 1900, and 
clearly indicates that disproportionately large numbers were returned at 
age 21, at higher ages ending in 0 and 5, and to a lesser degree at those 
ending in 2 and 8, which, in the absence of any evidence that migration 
or varying birth and death rates could produce such marked and isolated 
irregularities, must be attributed to misstatements of age. 

* More elaborate diagrams exhibiting similar characteristics in the 1940 
U.S. population may be found on p. xi of Part 1 (United States Summary) of 
Vol. IV of the Reports on the 1940 Population Census. 

† A similar diagram for the 1930 and 1940 U.S. populations appears on 
p. ix of the publication referred to in the preceding footnote. 

DIAGRAM II 

Age and Sex Distribution of Foreign-born Whites in Age 
Groups; U.S. 1910 

[Diagram: per cent of males and of females in each age period.] 

DIAGRAM III 

Age and Sex Distribution of Whole Population by Single Years; U.S. 1900 

[Diagram.] 

34. (a) This concentration on particular ages has been analyzed 
numerically in a number of ways. The well-known tendency to return ages 
ending in 0 and 5 has been indicated in the U.S. reports by comparing 
“the number of persons between 23 and 62 years whose ages are returned 
as multiples of 5 with one-fifth of the total number from 23 to 62 years,” 
on the theory that if there were no concentration on multiples of 5 these 
two figures would be “about equal,” the limits 23 and 62 being selected 
as covering the period in which such concentration is most marked. When 
expressed as a percentage this “Index of Concentration” gave the 
following results in 1910 (13th Census Report, Vol. I, p. 291):* 


                                         INDEX OF CONCENTRATION: PER CENT
                                         THAT NUMBER REPORTED AS MULTIPLES
                                         OF FIVE FORMS OF ONE-FIFTH OF TOTAL
                                         NUMBER AGED 23 TO 62 YEARS INCLUSIVE
CLASS OF POPULATION
                                            Male      Female      Total

Native White, Native Parentage ..........   111.4     112.7       112.1
Native White, Foreign or Mixed Parentage    110.8     114.0       112.?
Foreign-born White ......................   13?.6     127.7       12?.2
Colored .................................   152.5     15?.2       153.?

    Total ...............................   120.0     120.2       120.1
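
The index just described is a single ratio, and a short Python sketch may make the arithmetic concrete. The function name and the two populations below are hypothetical illustrations, not census data; only the definition (multiples of 5 between ages 23 and 62, compared with one-fifth of the total at those ages) is taken from the text.

```python
def index_of_concentration(counts_by_age, low=23, high=62):
    """Per cent that the number returned at multiples of 5 forms of
    one-fifth of the total number aged `low` to `high` inclusive."""
    ages = range(low, high + 1)
    total = sum(counts_by_age[a] for a in ages)
    at_multiples = sum(counts_by_age[a] for a in ages if a % 5 == 0)
    return 100.0 * at_multiples / (total / 5.0)


# Hypothetical single-year counts: a level population of 1,000 at every age.
level = {age: 1000 for age in range(100)}

# The same population with 200 persons drawn to each multiple of 5,
# taken equally from the age below and the age above.
heaped = dict(level)
for age in range(25, 65, 5):
    heaped[age] += 200
    heaped[age - 1] -= 100
    heaped[age + 1] -= 100

print(index_of_concentration(level))   # no concentration on multiples of 5
print(index_of_concentration(heaped))  # concentration on multiples of 5
```

With these invented figures the level population gives an index of 100 and the heaped one an index of 120. A population that declines with age gives an index slightly different from 100 even without heaping, which is the point taken up in the paragraphs that follow.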


* References to the reports of the U.S. census of 1910 are made here and 
elsewhere in this Study despite the lapse of time, because the descriptive 
text and analyses presented in those volumes were particularly complete 
and valuable. The 1920, 1930, and 1940 reports were published with 
generally much less analytical text. These later reports, however, are of 
course also recorded whenever necessary as additional sources of 
information. 

(b) The same principle was used more comprehensively in the following 
table given by T. G. Ackland in J.I.A., XLVII, 310, which shows the 
numbers recorded at each digit of age out of a total of 1,000 at all ages in 
each of six Provinces of India at the census of 1911, and the order (in 
parentheses) in which the several digits were recorded (see page 44). 
The preference for 0 and 5 in this comparatively illiterate population is 
very marked, with even digits following in the order 2, 8, 6, 4, and 
thereafter the odd digits appearing in the order 3, 7, 1, and 9. (It must be 
noted, however, as emphasized in the next paragraph, and as shown also 
by L. S. Vaidyanathan's Actuarial Report on the 1931 census of India, 
pp. 128-30, that the last position occupied by digit 9 results from the 
manner in which the count was taken for Ackland's table, and is 
consequently fictitious.) This same order of preference was found by 
H. G. W. Meikle in the 1921 census of India (see review in T.A.S.A., 
XXVII, 467). At the next census of 1931 Vaidyanathan's Report (cf. 
T.A.S.A., XXXVI, 138) showed that the digit selection was again the same 
except that the 





                        DIGIT OF AGE RECORDED IN CENSUS
PROVINCES
                   0     1     2     3     4     5     6     7     8     9

           Numbers (per 1,000) Recorded in Respect of Each Digit of Age

Bengal .......     ?    43   121    56     ?   187    76    57   106    37
                 (1)   (9)   (3)   (8)   (6)   (2)   (5)   (7)   (4)  (10)
Bombay .......   292    43   110    56    60   215    66    47    78    33
                 (1)   (9)   (3)   (7)   (6)   (2)   (5)   (8)   (4)  (10)
Burma ........   187    7?   106    98    7?   142    85    80    84    64
                 (1)   (9)   (3)   (4)   (8)   (2)   (5)   (7)   (6)  (10)
Madras .......   2?4    48   113    61    7?   171    80     ?    99    40
                 (1)   (9)   (3)   (7)   (6)   (2)   (5)   (8)   (4)  (10)
Punjab .......   27?    41   110    55    67   ?08    78    4?    84    36
                 (1)   (9)   (3)   (7)   (6)   (2)   (5)   (8)   (4)  (10)
United
  Provinces ..   291    47   113    45    65   189    83    43    91    33
                 (1)   (7)   (3)   (8)   (6)   (2)   (5)   (9)   (4)  (10)

Totals .......     ?     ?   673     ?   407     ?     ?     ?     ?   243
Mean values ..   261    50   112     ?    68   18?    7?    54    8?    41
Order of
  record .....   (1)   (9)   (3)   (7)   (6)   (2)   (5)   (8)   (4)  (10)


order of the odd digits was 7, 3, 9, and 1. Meikle's and Vaidyanathan's 
reports both contain elaborate discussions of the psychology and 
characteristics of digit selection for a population in which the percentage 
of illiteracy is high (as evidenced, inter alia, by the marked preference 
for digit 5 in comparison with its later position in more literate 
populations). 

The principle of the numerical illustrations shown in the preceding 
tables of this paragraph is to compare the numbers recorded at each digit 
with the total number at all digits for an extensive range of ages (such as 
from 23 to 62). Another similar method was used in J.I.A., XLVIII, 
where the numbers at each digit were compared with the number at digit 
9 instead of with the number at all digits. It must be noted particularly, 
however, as pointed out by George King (J.I.A., XLIX, 301), that these 
methods of dealing with digits at different ages conceal the fact that when 
the count is started from digit 0 the population at digit 0 is normally 
greater than at digit 1, and that similarly the number at digit 1 is normally 
greater than at digit 2, and so on, with the consequence that some excess 
of population will appear at digit 0 in comparison with 1, and so on up to 
digit 9, without any selection of digits whatever. This natural excess of 
digits 0, 1, . . . , 8 over digit 9 when the count is started from 0 is there 
shown to range from a ratio of 1.196 to 1.022 even for the H^M (Text Book) 
Table which, being a graduated life-table population, should not show 
any digit preference; and these ratios will be varied when the count is 
started from some other digit. This natural excess of some digits over 
others, and its dependence upon the digit from which the count starts, 
must therefore be borne in mind in using these methods. 
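
King's point can be checked on any smooth declining series. The Python sketch below is only an illustration: `smooth_population` is a hypothetical straight-line population (not the Text Book table), and the choice of seven complete decades of age is an assumption made for the example.

```python
def digit_totals(pop, first_age, decades):
    """Total population at each terminal digit of age, counted over
    `decades` complete decades of age beginning at `first_age`."""
    totals = [0.0] * 10
    for age in range(first_age, first_age + 10 * decades):
        totals[age % 10] += pop(age)
    return totals


def smooth_population(age):
    # Hypothetical population declining steadily with age; by construction
    # it contains no digit preference whatever.
    return 1000.0 - 10.0 * age


from_10 = digit_totals(smooth_population, 10, 7)   # count started at digit 0
from_15 = digit_totals(smooth_population, 15, 7)   # count started at digit 5

print([round(t / from_10[9], 3) for t in from_10])  # ratios to digit 9
print(from_15.index(max(from_15)))                  # digit with largest total
```

When the count starts from age 10 every digit from 0 to 8 shows an excess over digit 9, although no ages have been misstated; when it starts from age 15 the largest total moves to digit 5.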

(c) The assumption underlying method (a), that the number at a 
particular age x would be about one-fifth of the total population 
enumerated at ages x - 2 to x + 2, and the corresponding assumption of 
method (b), implicitly suppose that the populations decrease with age in 
arithmetical progression, that is, in a straight line with a constant first 
difference. Under normal conditions, however, the population curve has a 
significant second difference, and the assumption of arithmetical 
progression consequently introduces a slight error. In the Reports on the 
1901 and 1911 censuses of England and Wales, therefore, the concentration 
at ages ending in 0 was estimated by redistributing the proportionate 
populations per 1,000 enumerated at ages 10x - 2 to 10x + 2 only, on the 
assumption of a second difference curve with the three constants chosen 
to reproduce the group total and the numbers at the extreme ages 10x - 2 
and 10x + 2 (and so that the total of the three values in the middle is also 
unchanged).* The results were shown for English subdivisions and for a 
number of foreign countries, in tables of which the following is a sample 
(see 1911 Census Reports, Vol. VII, p. xxviii): 

* The preliminary redistributions applied by Greville in the U.S. Life 
Tables and Actuarial Tables, 1939-41, for certain Negro populations and 
deaths (as explained in par. 52 of this Study) produce exactly the same 
numerical results as this method of par. 34(c). In both procedures, 
moreover, it should be noted that five values are in fact adjusted so that 
the two end values are unchanged and the three in the middle are 
redistributed subject to their total remaining the same. 



          PROPORTIONS PER 1,000 IN EACH 5-YEAR GROUP, OF WHICH
                    THE CENTRAL AGE (10x) IS 10x = 40

                       England and Wales, Males

    Age            Enumerated     Calculated     Difference

    10x - 2 ....      222           222              ..
    10x - 1 ....      198           204.5          -  6.5
    10x ........      222           193.5          + 28.5
    10x + 1 ....      167           189            - 22.0
    10x + 2 ....      191           191              ..



These tables show that generally the excess at the round number is drawn 
considerably more from the age above than from the age below (see also 
par. 83). It is to be noted that this method measures the concentration at 
each separate decennial point by using five ages only, and thus 
practically eliminates the objection as to the natural excess of certain 
digits. A disadvantage, however, is that the age ending in 8, which usually 
shows an excessive population, is included in the group and is assumed to 
be one of the correct values from which the adjusted numbers are computed. 
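
The redistribution of method (c) follows directly from its three conditions. The Python sketch below is a worked illustration rather than the census office's own procedure: the function name is hypothetical, the curve is written as u(t) = a + bt + ct² with t running from -2 to +2 (a notation introduced here), and the input is the sample for 10x = 40, England and Wales, males.

```python
def redistribute_five(enumerated):
    """Second-degree curve over five consecutive ages which reproduces the
    two end values and the group total.

    With u(t) = a + b*t + c*t*t for t = -2, -1, 0, 1, 2:
        total        = 5a + 10c
        first + last = 2a + 8c
        last - first = 4b
    """
    first, last = enumerated[0], enumerated[-1]
    total = sum(enumerated)
    a = (4.0 * total - 5.0 * (first + last)) / 10.0
    b = (last - first) / 4.0
    c = (total - 5.0 * a) / 10.0
    return [a + b * t + c * t * t for t in (-2, -1, 0, 1, 2)]


enumerated = [222, 198, 222, 167, 191]          # ages 38 to 42, per 1,000
calculated = redistribute_five(enumerated)
differences = [e - c for e, c in zip(enumerated, calculated)]

print(calculated)    # 222, 204.5, 193.5, 189, 191
print(differences)   # 0, -6.5, +28.5, -22.0, 0
```

The calculated values have a constant second difference of 6.5, and the excess of 28.5 at the round age is seen to be drawn 22.0 from the age above and only 6.5 from the age below.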

(d) In order to eliminate this last objection and broaden the method, 
an improved technique (which was applied to the 1911 as well as the 1921 
data) was introduced in the General Report on the 1921 Census of England 
and Wales (p. 73). Instead of using each 5-year group separately (as 
in the illustrative table for 10x = 40 of the preceding paragraph) and 
adjusting its five values by a second degree curve reproducing the group 
total and the two end values, the populations for the entire range of ages 
from 23 to 72 were aggregated in ten groups according to the units digit in 
the age, so that those enumerated at 23, 33, 43, 53, and 63 formed the 
first group, those at 24, 34, etc., the second group, and so on; and then 
(after reduction to a total of 10,000 for comparative purposes) a second 
degree curve was fitted by least squares. The table on page 47 illustrates 
the method. 

The results showed the order of digit selection as 0, 8, 2, 5 (4, 6 | 3) 9, 
7, 1 for males and 0, 8, 2 (4, 5, 3 | 6) 9, 7, 1 for females at the 1911 
census, and 0, 8, 2 (3, 9 | 5, 4, 6) 7, 1 for males (as in the following 
table) and 0, 8, 2 (3, 4, 5 | 9) 6, 7, 1 for females in 1921, where the bar 
marks the point of change from selection of digits to avoidance, and the 
parentheses enclose those digits for which the difference between the 
actual and graduated numbers is less than 1% of the graduated (see H. H. 
Wolfenden, T.A.S.A., XXIX, 329). Analysis of the figures, moreover, 
emphasized again that the persistent preference for 0 is almost entirely at 
the expense of the next higher age (as remarked earlier), and that the 
other principal disturbance involves digits 7 and 8 in which “the movement 
may be more fairly associated with an avoidance of the digit 7 than with 
any special respect for ages ending in 8.” (Similar features appear in the 
1931 census, for which the General Report was not published until 1950 on 
account of delays due 

                            Males, 1921

                  PROPORTIONATE
AGES                POPULATION                  DIFFERENCE
ENDING IN
               Actual     Graduated       Amount      Per Cent of
                                                      Graduated

3 ........      1,083       1,074          +  9          +  .8
4 ........      1,064       1,068          -  4          -  .4
5 ........      1,056       1,057          -  1          -  .1
6 ........      1,038       1,044          -  6          -  .6
7 ........      1,002       1,026          - 24          - 2.3
8 ........      1,020       1,005          + 15          + 1.5
9 ........        982         979          +  3          +  .3
0 ........        998         950          + 48          + 5.1
1 ........        868         917          - 49          - 5.3
2 ........        889         880          +  9          + 1.0

Total ....     10,000      10,000          ± 84          ±  .84


to the War.) It was further pointed out that in Britain the other digits 
showed “no consistent and significant disagreement between the crude 
and graduated figures and it may be inferred, therefore, that statements 
at these ages are fairly accurate on the whole.” In comparison with the 
marked preference for 5 in more illiterate populations, the data indicated 
that feature to be of small importance in the literate population of Britain, 
while the selection of even digits was slight. The adoption in 1921 of the 
age query in years and months (instead of age last birthday), and 
increasing familiarity with the forms and objectives of the census, appear 
to have been major influences in reducing digit misstatements in 1921 to 
less than half for males and less than two-thirds for females of those in 
the census of 1911 (see the 1921 General Report, p. 75). 
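
The least-squares step of method (d) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the General Report's own computation: the function name is hypothetical, the ten group totals are the "Actual" column for males in 1921 as given in the text, and the fit uses orthogonal polynomials on equally spaced points as one convenient way of obtaining a least-squares second degree curve.

```python
# The ten groups in the order in which they arise from ages 23 to 72.
DIGITS = [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
ACTUAL = [1083, 1064, 1056, 1038, 1002, 1020, 982, 998, 868, 889]


def fit_second_degree(values):
    """Least-squares second-degree curve through equally spaced values."""
    n = len(values)
    t = [i - (n - 1) / 2.0 for i in range(n)]        # centred abscissae
    mean_t2 = sum(x * x for x in t) / n
    q = [x * x - mean_t2 for x in t]                 # orthogonal to 1 and to t
    a0 = sum(values) / float(n)
    a1 = sum(x * y for x, y in zip(t, values)) / sum(x * x for x in t)
    a2 = sum(z * y for z, y in zip(q, values)) / sum(z * z for z in q)
    return [a0 + a1 * x + a2 * z for x, z in zip(t, q)]


graduated = fit_second_degree(ACTUAL)
per_cent = {d: 100.0 * (a - g) / g
            for d, a, g in zip(DIGITS, ACTUAL, graduated)}

# Digits arranged from the greatest selection to the greatest avoidance.
order = sorted(DIGITS, key=lambda d: per_cent[d], reverse=True)

print([round(g) for g in graduated])
print(order)
```

The fitted values agree with the printed "Graduated" column to within a unit, and the resulting order of digits, 0, 8, 2, 3, 9, 5, 4, 6, 7, 1, is the order stated in the text for males in 1921.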

35. (a) The preceding methods of examining the ungraduated enumerated 
populations are intended to show the concentration at particular 
ages. A general measure of these inaccuracies at all ages may be obtained 
from the ungraduated populations by means of a “Coefficient of Error” 
which was employed by A. A. Young in the Report on Ages at the 1900 
U.S. Census, and was also used at the 1910 census under the name of the 
“Test of ‘Minus Differences’” (see Supplementary Analysis, 12th Census, 
p. 135, and Vol. I, 13th Census, p. 292). By this method it is assumed that 
in a true series of normal enumerated populations the number at each age 
would be less than at the preceding age, so that $P_x - P_{x+1}$ would be 
positive throughout and there would be no “minus differences.”* The sum of 
all these differences regardless of sign from age 0 to (say) age 99 would 
then equal the total decrease from age 0 to 99. When the enumerations are 
affected by errors, however, there will be some “minus differences” at ages 
affected by those errors, and the total decrease from age 0 to 99 will now 
be given by the sum of all the differences regardless of sign (as in the true 
series) less twice the sum of the “minus differences.” This sum of the 
“minus differences” may therefore be regarded as “a measure of the 
inaccuracy of the reported ages,” and should be expressed as a percentage of 
the corresponding populations. The following are some of the results for 
certain sections of the U.S. population: 


      “COEFFICIENTS OF ERROR”: PERCENTAGES THAT SUM OF “MINUS
          DIFFERENCES” FORMS OF POPULATIONS AGED 0-99

                                      1910            1900            1890
CLASS
                                  Male  Female    Male  Female    Male  Female

Native White of Native
  Parentage ..................     3.0    3.4      1.6    1.6      5.9    6.2
Native White of Foreign or
  Mixed Parentage ............     2.8    3.5      1.4    1.4      4.4    4.9
Colored ......................    10.6   11.5      9.7   11.2     13.2   15.7
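
The test of "minus differences" reduces to a few lines of arithmetic. The Python sketch below is illustrative only: the function name and the two populations are hypothetical, and the denominator is taken as the whole population at the ages used, following the heading of the table above.

```python
def minus_difference_test(pop):
    """Return (sum of the minus differences, coefficient of error in per cent).

    pop[x] is the number enumerated at age x; a "minus difference" arises
    wherever pop[x] - pop[x + 1] is negative.
    """
    diffs = [pop[x] - pop[x + 1] for x in range(len(pop) - 1)]
    minus = sum(-d for d in diffs if d < 0)
    return minus, 100.0 * minus / sum(pop)


# Hypothetical smooth population at ages 0-99, declining by 20 at each age.
smooth = [2000 - 20 * age for age in range(100)]

# The same population with 100 persons drawn to each age ending in 0 from
# 20 to 90, taken equally from the ages on either side.
heaped = list(smooth)
for age in range(20, 100, 10):
    heaped[age] += 100
    heaped[age - 1] -= 50
    heaped[age + 1] -= 50

print(minus_difference_test(smooth))
print(minus_difference_test(heaped))

# Identity quoted in the text: total decrease from the first age to the last
# equals the sum of all differences regardless of sign, less twice the sum
# of the minus differences.
all_diffs = sum(abs(heaped[x] - heaped[x + 1]) for x in range(99))
minus, _ = minus_difference_test(heaped)
print(heaped[0] - heaped[99] == all_diffs - 2 * minus)
```

For the smooth series the coefficient is zero; for the heaped series each of the eight heaps contributes two minus differences, and the identity stated in the text holds exactly.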







* This assumption is not permissible in the case of populations markedly 
affected by migration, such as the foreign-born whites of Diagram II, 
par. 33; and the method therefore cannot be applied in such cases. It may 
be noted that the differences here used are the first differences of $P_x$ 
taken negatively, $P_x$ being defined as in par. 64 of this Study. 

(b) Another general measure of the inaccuracies caused by digit 
selection has been developed more recently by R. J. Myers (“Errors and Bias 
in the Reporting of Ages in Census Data,” T.A.S.A., XLI, 402 and 411). 
In order to eliminate the inherent bias which results from starting the 
count always from some particular digit, Myers' procedure uses a “blended” 
count starting at each of the ten digits in turn and then averaging the 
results. This method (as he demonstrated for the H^M [Text Book] Table 
for which King had shown the natural excesses over digit 9 ranging from 
19.6% for digit 0 to 2.2% for digit 8 when the count is started from 0; 
see par. 34(b) here) removes the bias completely. 

The extent of the concentration or deficiency for each digit, as 
measured by this blended method, is shown by Myers as a percentage; and an 
“Index of Preference” is then taken as the sum, regardless of sign, of the 
deviations of these percentages from 10% (for if no digit selection were 
present, all the percentages would be 10% and the index would be 0). 
The method was applied in Myers' paper to the U.S. populations from 
1880 to 1930, and was extended by Greville to the 1940 census (see 
“United States Life Tables and Actuarial Tables, 1939-1941,” p. 121) 
with the following results: 

            “INDEX OF PREFERENCE” IN UNITED STATES
                   POPULATIONS, 1880-1940

               Index of                                   Index of
Census         Preference      Census                     Preference

1880 ......      20.8          1940:
1890 ......      15.6            Total Population ....       6.0
1900 ......       9.4            White Males .........       4.2
1910 ......      11.2            White Females .......       5.6
1920 ......       9.0            Non-white Males .....      16.2
1930 ......       8.6            Non-white Females ...      18.2


Greville's analyses accompanying these figures (which may be compared 
with those reached by Young's Coefficient of Error as shown in this 
paragraph) draw attention to the generally sustained improvement over the 
entire period, to the favorable index for 1900 which emerged in 
consequence of age and date of birth (instead of age alone) being asked for 
in that census (see also T.A.S.A., XVIII, 265, and par. 42 here), and to the 
high indices for non-whites. The influences of increasing familiarity with 
the objectives of the census, the form of the age query, and the degree of 
literacy are also shown by the low indices for the total population of 
England and Wales, which in 1911, 1921, and 1931 were only 4.4, 2.4, and 1.6 
respectively (see Myers, loc. cit., 403-4, and compare the conclusions 
stated in par. 34(d) here). 
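
A blended count of the kind described can be sketched as follows in Python. The sketch is illustrative and rests on assumptions that are not stated in the text: the ten counts are taken to begin at ages 10 to 19 and each to cover 70 consecutive ages, the function names are hypothetical, and the two populations are invented. The index is formed as the text describes it, as the sum regardless of sign of the deviations from 10%.

```python
def blended_percentages(pop, first_start=10, span=70):
    """Per cent of a blended count falling at each terminal digit of age.

    Ten counts are made, each over `span` consecutive ages and each begun
    one year later than the one before, and the ten are added together.
    """
    blended = [0.0] * 10
    for start in range(first_start, first_start + 10):
        for age in range(start, start + span):
            blended[age % 10] += pop[age]
    total = sum(blended)
    return [100.0 * b / total for b in blended]


def index_of_preference(pop):
    """Sum, regardless of sign, of the deviations of the blended
    percentages from 10 per cent."""
    return sum(abs(p - 10.0) for p in blended_percentages(pop))


# Hypothetical smooth population, free of digit selection.
smooth = [2000 - 20 * age for age in range(100)]

# The same population with 150 persons moved from each age ending in 1
# to the preceding age ending in 0, from age 20 upward.
heaped = list(smooth)
for age in range(20, 90, 10):
    heaped[age] += 150
    heaped[age + 1] -= 150

print(index_of_preference(smooth))
print(round(index_of_preference(heaped), 2))
```

For the straight-line population every digit receives exactly 10% of the blended count, so that the index is zero and the natural excess of the earlier digits has disappeared; the heaped population gives a positive index arising almost wholly from digits 0 and 1.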

36. The object of all the preceding methods is to obtain from the 
ungraduated data an approximate estimate of the errors which they contain. 
The true test, however, would of course be to compare the number 
actually enumerated at each age with the correct number which should 
have been enumerated; and it has therefore been suggested that a close 
estimate will be given by graduating the enumerated populations, assuming 
that these graduated numbers are correct, and examining the differences 
between the graduated and ungraduated figures. This method, however, 
is open to the objection that no graduated series can properly be 
assumed to be a representation of the numbers which should have been 
recorded, in the sense that the differences between it and the ungraduated 
series are only misstatements of age; furthermore, different methods of 
graduation vary greatly in their smoothing power, and hence may produce 
very different estimates of errors. The extent to which such estimates are 
attributable to migrations rather than to misstatements of age will also 
be difficult to measure. For quantitative comparisons the results must of 
course be expressed as percentages or ratios of the graduated populations 
to which they relate. 

The method may sometimes be used with caution, however, so long as 
the population curve has not been seriously disturbed by migration, and 
if a method of graduation is adopted which is designed to eliminate only 
the minor roughnesses which cannot be explained as the effects of 
migration. Dr. Dunlop, for example, in J.R.S.S., LXXXVI, 550 et seq., 
estimated the errors in the age returns at the Scottish censuses of 1871, 
1911, and 1921 by adjusting the ungraduated populations between ages 17 
and 93 by osculatory interpolation, the resulting deviations per thousand 
at each age being shown in tables of the following form: 


              ERRORS PER THOUSAND IN RETURNS OF AGE:
                  CENSUS 1921, SCOTLAND, MALES

TENS                           UNIT DIGIT OF AGE
DIGIT
OF AGE      0      1      2      3      4      5      6      7      8      9

10 ....     ..     ..     ..     ..     ..     ..     ..   - 18   + 28      ?
20 ....   +  4   + 49   -  8   - 36   - 25   - 28   -  6   - 10   + 26      ?
30 ....   + 59   - 45      ?   - 19   - 17      ?   + 19      ?   + 14      ?
40 ....   + 55   - 25      ?   - 31   - 40      ?   - 16      ?   + 50      ?
50 ....   + 75   - 74      ?   - 34   +  2      ?   + 41      ?   + 18      ?
60 ....   +107   - 71      ?   - 32   -  5      ?   - 15   - 77   + 18      ?
70 ....   + 36   - 85   + 70   - 23   - 34      ?   + 48   + 21   - 20   - 74
80 ....   + 60   - 76   - 11   - 41   + 70   + 15   - 55   - 36   + 39   - 23
90 ....   + 94   -168   + 22      ?     ..     ..     ..     ..     ..     ..


















The excess at ages ending in 0 and 8, and the disinclination to use those 
ending in 1, 3, and 7, is apparent. 

In consolidating such results the actual enumerated populations at the 
same digit of age are sometimes merely added and compared with the 
corresponding graduated populations, as in the following table given by 
G. H. Knibbs in his “Mathematical Theory of Population” (Appendix A, 
1911 Census of Australia), p. 111, which again shows a marked tendency 
to report ages ending in 0, 5, 6, and 8, and avoidance of digits 1, 3, 7, and 
9: 

            RATIO OF NUMBER RECORDED TO ADJUSTED NUMBER,
             CENSUSES 1891, 1901, 1911, AUSTRALIA, MALES

YEAR OF                  UNIT FIGURE IN AGE LAST BIRTHDAY
CENSUS
             0       1       2       3       4       5       6       7       8       9

1891 ...  1.1?88   .9167  1.0088   .??45   .9969  1.0366  1.0207   .9515  1.0055   .9552
1901 ...  1.1041     ?    1.0072   .9677   .9809  1.0?45  1.0154   .9656  1.0144   .9667
1911 ...  1.0185   .9956   .9944   .9787   .9990  1.0085  1.00?7   .9691  1.0191   .96?5


This method of consolidation gives, for each digit, slightly greater
weight to the earlier ages, since normally their populations are larger than
at the later ages. In any case where this may be a disturbing factor the
populations may be equalized at each age, in order to give the same weight
to the earlier and later ages. This was done by George King in J.I.A.,
XLIX, 517 (and 502), as is shown in columns (2) and (3) of the following
table there given:

England and Wales, Males, Ages 10-89: Total Deviation at Each Digit
of Age per 100,000 of Graduated Population of Each Age, and the
Corresponding Enumerated Populations

Digit of Age      Total Deviation      Corresponding Enumerated Population
    (1)                 (2)                           (3)

     0                 70,989                      870,989
     1                -61,618                      738,382
     2                  7,515                      807,515
     3                -25,060                      774,940
     4                 -4,743                      795,257
     5                  2,496                      802,496
     6                  1,246                      801,246
     7                -27,094                      772,906
     8                 25,584                      825,584
     9                   -736                      799,264


In now presenting in ratio form the conclusions to which these figures lead
we may either take for each digit the ratio of the enumerated population
in column (3) to the corresponding graduated population, which is 800,000
in each case, or we may use the ratio of the enumerated population at each
digit to the enumerated population at some particular digit, such as
digit 9 as used by King. The relative results for the different digits are the
same by these two methods, for the reference in the second method to
digit 9 (say) in each case merely multiplies the ratios of the first method
by the constant factor (800,000) ÷ (799,264). It is to be noted, however,
that in King’s method the ratio 1.000 for digit 9 does not indicate
correctness at that digit (as would be concluded if such a ratio appeared in
the first method here, or in Knibbs’ table), and that the measure of
inaccuracy at digit 9 and at each of the other digits should strictly be
obtained from King’s ratios by multiplying by the factor
(799,264) ÷ (800,000).* The measures so obtained (for comparison with
Knibbs’ Australian results), and King’s ratios to digit 9, are as follows:

England and Wales, Males, 1911, Ages 10-89

                                              Digit of Age
                           0      1      2      3      4      5      6      7      8      9

Ratio of Recorded to
Graduated Number, on
basis of 100,000
Graduated at each Age   1.0887  .9230 1.0094  .9687  .9941 1.0031 1.0016  .9661 1.0320  .9991

Ratios to Digit 9, as
shown by Mr. King       1.090   .924  1.010   .970   .995  1.004  1.003   .967  1.033  1.000


The order of digit selection here is 0, 8 (2, 5, 6 | 9, 4) 3, 7, 1 for males.
The corresponding order for females was 0, 8, 2 (4, 5, 6) | 9, 3, 7, 1, as
given in T.A.S.A., XXIX, 329. The numerical results shown in this table
on the basis of equally weighted graduated populations are extremely
close to those obtained for the same data by Myers’ blended method of
par. 35(b); see T.A.S.A., XLI, 414-15.
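
The equivalence of the two ratio forms discussed above can be checked numerically. The sketch below, in Python, uses only the enumerated populations for digits 0, 2, 5, and 9 from King's table together with the graduated population of 800,000 at each digit. The variable names are illustrative assumptions and form no part of King's paper.

```python
GRADUATED = 800_000  # graduated population at every digit (King's equalized basis)

# Enumerated populations for a few digits, taken from King's table.
enumerated = {0: 870_989, 2: 807_515, 5: 802_496, 9: 799_264}

# First method: ratio of enumerated to graduated population at each digit.
ratio_to_graduated = {d: n / GRADUATED for d, n in enumerated.items()}

# Second method (King): ratio of each digit's enumerated population to digit 9.
ratio_to_digit9 = {d: n / enumerated[9] for d, n in enumerated.items()}

# Multiplying King's ratios by 799,264 / 800,000 recovers the first method.
factor = enumerated[9] / GRADUATED
converted = {d: r * factor for d, r in ratio_to_digit9.items()}

for d in sorted(enumerated):
    print(d, round(ratio_to_graduated[d], 4), round(ratio_to_digit9[d], 3))
```

The ratio for digit 9 is 1.000 by construction in the second method, while the first method gives .9991, which is the point made in the text about reading King's ratios.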

The graduated results, by age and sex, of a 1% sample of the 1951
population of Great Britain have recently been published in Part II of the
“Census of Great Britain; One per cent. Sample Tables” (1952).

* Digit 9 was used by King only in order to compare the method of this
paragraph with that of J.I.A., XLVIII, 208 (par. 34(b) of this Study). He
indicates clearly the disadvantages of this selection of a particular digit.

37. All the preceding indications of misstatements of age are derived
from populations as actually enumerated, the numbers which should
have been returned at each age being merely estimated therefrom. The
misstatements so shown are therefore only approximate. It is of course
not practicable to obtain complete statistics of actual misstatements,
since it would be necessary to secure corrected statements from all those
who had originally given their ages incorrectly. It may be noted, however,
that at the Australian census of 1911 (G. H. Knibbs, loc. cit., pp. 112
et seq.), 7,000 admissions of misstatement (mostly from women) were
obtained as a sample, the actual amount of error being given in 1,660
cases; and the results were analyzed according to correct age and amount
of misstatement. Although the number of cases was small, and the
proportion which the 7,000 admissions bore to the total number of
misstatements in the census was unknown, the results are interesting as
giving some indication of the form of the curves of actual misstatement,
and as showing that 94.64% of the aggregate cases of misstatements were
understatements and only 5.36% were overstatements (see also par. 53
here).

Another approach to the problem, by which the actual misstatements
are traced precisely, is of course the extension to all ages of the method
used by Dr. Dunlop for children under 5 or 6 as described in par. 32 here.
This was done in the General Report on the 1921 Census of New Zealand,
where for 2,219 individuals (1,111 males and 1,108 females) the ages
stated (as of last birthday) in the census returns were actually checked
with the birth registration records.*

* In analyzing this material, it is to be noted that the detailed tables on
pp. 93-94 of the General Report are correct, but that a slight discrepancy
appears in the summarized table at the middle of p. 90. The error is
pointed out in H. H. Wolfenden’s discussion in T.A.S.A., XLII, 81, where
these New Zealand statistics are also shown for ages 0 to 5 in Dr. Dunlop’s
tabular form (see par. 32 of this Study).

(3) Inclination To Overstate the Age until Majority, Then To
Understate, and Afterwards To Overstate at Advanced Ages

38. In addition to the tendency to select particular digits, an inclination
is often manifested in census statistics to overstate the age slightly
until attainment of age 21 (with concentration on that age, particularly
by men; see, for example, Myers’ analysis in T.A.S.A., XLI, 397-98),
then to understate at adult ages, and afterwards to overstate at the
advanced ages. These errors were called “major deliberate errors” by King,
in order to distinguish them from the errors of digit selection which he
named “minor deliberate errors.”

The nature of these “major deliberate errors” may vary considerably
according to the laws and social customs under which the people live. The
regulations governing school attendance, and child labor laws, may invite
overstatement at the juvenile ages arising from a desire to report an age
which will allow freedom from such enactments; in most countries the
attainment of marriageable age and of majority is often a supposed
objective; and in countries such as India these influences are further
complicated by superstition and religious customs (see J.I.A., XXV, 242, and
XLVII, 401-2). At the adult ages deliberate understatement has long
been suspected, especially among women. The widespread introduction
of pension and social security legislation covering the older ages, such as
65 and beyond, has produced in some communities a marked overstatement
at those ages. In the final age groups, moreover, considerable
overstatements are often found, which usually are attributed to a desire to
share in the greater consideration and esteem which attach to extreme
old age.

39. The precise measurement of these errors, however, is often difficult,
and their explanation frequently necessitates an attempt to apportion the
relative influences of various involved causes. Moreover, when there is a
significant irregularity at several consecutive ages in the contour of the
age diagram which might be explained by one of the causes enumerated in
par. 38, it can often be argued that the irregularity might be due to
migrations, and in the absence (generally) of complete and reliable
migration statistics it is sometimes difficult to establish or disprove that
argument.

The suspicion of considerable major deliberate errors at adult ages is
usually supported by the fact that in England and Wales, for example, the
females aged 20-24 (and sometimes over the wider range of ages 19 to 28)
at each census since 1851 have always largely exceeded the numbers who
could have survived from those enumerated at the previous census, and
the same feature has been noticed in the United States (see J.I.A.,
XLIX, 99-101 and 341, the General Report on the 1910 U.S. Census,
p. 296, and the General Report on the 1921 Census of England and
Wales, p. 80). This peculiarity, also, is sometimes associated with an
apparent deficiency in the numbers recorded at ages above 24 (and
especially, in the 1921 census of England and Wales, at the age group 34 to
38) in comparison with the survivors expected from the previous census.
It has therefore been suggested that the excess at ages 20-24 (or 19-28)
in conjunction with the deficiency at higher ages points to large deliberate
understatements of age; and it is held by those who favor this view that
only a portion of the anomalies can be satisfactorily explained as being due
to migration (see J.I.A., XLIX, 100, 333, 337, and 341, the General
Report, 1921 Census of England and Wales, p. 80, and V. P. A. Derrick’s
analyses in J.I.A., LVIII, 117).
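
The comparison of an enumerated population with the survivors expected from the previous census can be sketched as follows. This is a minimal illustration in Python under stated assumptions: age groups are keyed by their lower age, the censuses are ten years apart, and survivorship is given as survivors per 1,000 over the interval. All names and figures are hypothetical. The sketch makes no allowance for migration, which is exactly the ambiguity discussed in the text.

```python
def survivor_discrepancies(previous, survivors_per_1000, current, interval=10):
    """Enumerated population minus expected survivors, by attained age group.

    previous / current map the lower age of a group to its population at the
    earlier / later census; survivors_per_1000 maps the earlier age group to
    the number surviving the interval out of each 1,000 (hypothetical inputs).
    """
    discrepancies = {}
    for age, population in previous.items():
        later_age = age + interval
        if later_age in current:
            expected = population * survivors_per_1000[age] // 1000
            discrepancies[later_age] = current[later_age] - expected
    return discrepancies


# Hypothetical figures: an excess at the younger adult group and a
# deficiency at the older one, as in the pattern described above.
previous = {10: 1_000_000, 25: 900_000}
survivors_per_1000 = {10: 960, 25: 930}
current = {20: 1_010_000, 35: 800_000}

result = survivor_discrepancies(previous, survivors_per_1000, current)
print(result)
```

An excess or deficiency found in this way is consistent either with misstatement of age or with net migration, so the calculation alone cannot decide between them.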

At the 1911 census of England and Wales, however, King contended that
this alleged understatement of age “does not exist to any great extent, and
that the apparent excess in the populations enumerated at the younger
ages must be due to some other cause, and the only other cause that occurs
to me is migration.” His argument was based on the theory that if the
ratio $P_{x+1}/P_x$ for the enumerated populations shows marked
irregularities, then (a) if these irregularities are due to deliberate and
extensive misstatements of age (unaccompanied by corresponding
misstatements in the death records), there would be similar contortions in
the curves of $m_x$ and the resulting life-table populations $L_x$, while
(b) if they are due to migrations, so that the enumerated populations “are
real, and the deaths recorded correspond to them,” then $m_x$ and
$L_{x+1}/L_x$ would be smooth, as was actually found to be the case (see
J.I.A., XLIX, 303-7, and discussion pp. 328, 333, 341-42, 348, and 350).
The assumption that discrepancies between the enumerated populations and
the survivors expected from the preceding census may be ascribed to
migration was also used in preparing the Austrian National Life Tables,
1901-10 (J.I.A., LIII, 216). The suspicion that “major deliberate errors”
do exist at the younger adult ages, however, is hardly yet satisfactorily
disproved.

40. The disturbances which have been noted in certain census data in
the neighborhood of the ages covered by the introduction of pension
legislation have been considered, to give two instances, in connection
with the 1921 census of England and Wales and the 1940 census of the
United States.

In the British data, comparisons by age-groups of the enumerated 1921
populations with the survivors expected from the previous census
suggested that the enumerated populations of both sexes were deficient
before and in excess after age 64. These calculations have been interpreted
“as indicating that a number of persons of both sexes tend hereabouts to
give their ages as somewhat in excess of their true value, leading to an
overstatement of population.” It was remarked, however, that this
inference is not conclusive, because although “the discrepancies from age 64
upwards could be attributed to a series of overstatements of population at
each census . . . they could equally be ascribed to a series of
understatements diminishing with advancing age and terminating at about
age 70; and since the period from 70 onwards is that covered by the Old Age
Pension Scheme under which grantees are required to produce evidence of
their ages, an explanation which is consistent with the assumption that
the population at ages over 70 is approximately correct appears to be
preferable to one which assumes that very large overstatements have
occurred” (see the General Report, 1921 Census of England and Wales,
p. 80).

In the analyses of the U.S. census of 1940, similarly, the enumerated
populations were deficient at ages 55 to 64, and in excess at ages 65 and
over, in comparison with the survivors expected from the previous census;
and it was again suggested that “the enactment of old-age insurance and
old-age assistance legislation during the decade may have led to some
overstatement of age in 1940 by persons actually 55 to 64 years old,”
although “it is also possible that persons in this age range may have
understated their ages in 1930” (see p. 3, Part 1, Vol. IV, 1940 Census). The
disturbances were so marked in the Negro data that special preliminary
redistributions of those populations (and deaths) between 55 and 69 were
made in the preparation of the life tables (see T. N. E. Greville, “United
States Life Tables and Actuarial Tables,” 1939-41, p. 110, and pars. 47
and 52 here).

41. The tendency to overstatement at the most advanced ages is
usually revealed by an examination of the statistics of centenarians.
Careful investigations of those who so report themselves are often made by
census authorities in Europe, and sometimes also in the United States;
and it is frequently found that the stated ages are either difficult to
substantiate or are definitely exaggerated. The largest numbers claiming to
be 100 years of age or over are generally found in populations which show
high percentages of illiteracy, as may be seen from the tables on pp. 143-
44, Supplementary Analysis, 12th U.S. Census, and p. 295, Vol. I, 1910
U.S. Census. In the U.S. Census of 1940, also, it was again observed that
“the returns exaggerate the number of centenarians, particularly among
non-whites” (Vol. IV, 3).

42. As in the case of inaccuracies in the reported ages of young children,
two factors may be suggested as the causes of these erroneous statements
of age, namely, (1) the form of the age query, and (2) illiteracy.

With regard to the age query, the International Statistical Congress of
1872 recommended that “where the degree of popular intelligence
permits, age should be asked for by year and month of birth.” This form of
question, or the even more exact method of recording the year, month,
and day of birth, as recommended by the Royal Statistical Society’s
Census Committee (J.R.S.S., LXXXIII, 138), is used in many European
countries, and it has the undoubted advantages that an approximate
reply to it is not easy, that it asks for a date which does not change, and
that it does not instantly reveal the age to curious enquirers. The census
authorities in the United States, however, have generally asked for the
age last birthday. Exceptions to this practice were the use of the nearest
age at the 1890 census, and the double question of date of birth and age
last birthday in 1900, which was discarded subsequently because it was
then considered that the combination of these two queries was rendered
valueless by the practice of many enumerators of computing the date of
birth from the age given (see J.A.S.A., XI[?], 110; Young’s contrary
opinion, loc. cit., p. 300, and Supplementary Analysis, 12th Census, p. 130;
and the favorable indices for 1900 quoted in par. 35 here). In Great
Britain the age last birthday was used until 1921, when a new method was
introduced which asked for the age in years and months, on the ground
that this would necessitate more careful replies than a question as to age
last birthday only, and because the authorities were not convinced of the
superiority of the query as to date of birth (see also the last sentence of
par. 34(d) here). In the 1931 census of India the nearest age was adopted
instead of the age last birthday, because under the particular
circumstances of the Indian population it was then suggested that “the ages
which the enumerators either guess or accept as correct are recorded
without any consideration as to whether they are ages next birthday or last
birthday, and they may therefore be assumed to be the ages at the nearest
birthday” (see the review of H. G. W. Meikle’s 1921 Report in T.A.S.A.,
XXVII, 470); for the 1941 census, however, Vaidyanathan recommended
(see his Actuarial Report on the census of 1931, p. 136) the use of
completed years and months which, as in Great Britain, was held to be
preferable because it forces attention to the question and to some extent at
least discourages a careless reply.

                                    Index of Concentration
                                    on Multiples of 5,
                       Date of      between Years of Age      Per Cent of
Country                Census       23-62 Inclusive           Illiteracy

1. Countries Where Age Inquiry Called for “Year of Birth”

Belgium                 1900             100                     18.6
Germany                 [?]              102
France                  1901             106                     14.1
Austria                 1900             111                     22.6
Hungary                 1900             [?]                     40.9
Bulgaria                [?]              245                     65.5

2. Countries Where Age Inquiry Called for “Age Last Birthday”

United States           1910             120                      7.7
England and Wales       [?]              100
Sweden                  [?]              101
Netherlands             1899             102
Spain                   1900             159                     58.7
Russia                  1897             182                     72.5
Argentina               1895             189                     54.4
Brazil                  [?]              190                     85.2

Note. The illiteracy is variously expressed as a percentage of the
population over 10, or 13, or some intermediate age, according to the form
of the available reports. The percentages for Argentina and Brazil are
based on populations over 6 and 0 respectively, and must therefore be used
with caution in comparison with the percentages for other countries.

Although a considerable body of evidence thus exists which indicates
that the accuracy of the reported ages is partially dependent on the form
of the age query, it is also true that high concentration on quinquennial
ages is associated with a high percentage of illiteracy. This may be inferred
from the data of pars. 32 and 34(b) here, and is also shown by the statistics
in the table on page 57 (from Vol. I, 1910 U.S. Census, p. 292).
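
An index of concentration on multiples of 5 between ages 23 and 62 can be computed as in the sketch below. The definition used here is an assumption: it is the form in which such an index is usually stated, namely 100 times the population returned at ages 25, 30, ..., 60 divided by one-fifth of the whole population aged 23 to 62, so that 100 indicates no concentration and 500 indicates that every age was returned as a multiple of 5. The function name and the counts are hypothetical.

```python
def concentration_index(population_by_age):
    """Index of concentration on multiples of 5 for ages 23-62 inclusive.

    population_by_age maps single years of age to enumerated counts.
    Assumed definition: 100 * (population at multiples of 5) / (total / 5).
    """
    total = sum(population_by_age[age] for age in range(23, 63))
    on_multiples = sum(population_by_age[age] for age in range(25, 61, 5))
    return 100 * on_multiples / (total / 5)


# Hypothetical returns: no heaping, then moderate heaping on multiples of 5.
uniform = {age: 1_000 for age in range(23, 63)}
heaped = {age: (1_200 if age % 5 == 0 else 950) for age in range(23, 63)}

print(concentration_index(uniform))  # no concentration
print(concentration_index(heaped))   # moderate concentration
```

The range 23 to 62 contains forty single ages, eight of them multiples of 5, which is why one-fifth of the total is the natural basis of comparison.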

(4) Persons of Unknown Age

43. An indication of the number of persons whose ages are returned at
a census as “unknown” is given by the following table from the General
Report, 1911 Census of England and Wales, p. 88:

Country, and Date              Total Enumerated      Number of Ages
of Census                      Population            Not Stated

England and Wales, 1911           36,070,492              13,167
Belgium, 1901                      6,693,548                   8
Denmark, 1901                      2,449,540               6,0[?]
France, 1901                      38,450,788             116,772
Holland, 1910                      5,858,175                  81
Italy, 1901                       32,475,253               1,112
Prussia, 1900                     34,472,509               6,741
Spain, 1901                       18,618,086              20,698
United States, 1910               91,972,266             169,055


These returns of “unknown” age may be due to (1) the method of
collecting the data, (2) illiteracy, or (3) the form of the age query. The
influence of the method of enumeration is shown by a greater proportion of
males than females of unknown age, which results from the fact that men
are more likely than women to be away from home, and that consequently
the information concerning men is more frequently supplied by third
parties who are ignorant of the true age. The effect of illiteracy is
indicated by the higher percentages of unknown ages which are often found
in the more illiterate population groups (see, for example, the classification
by population groups of the 169,055 persons of unknown ages in the
preceding table, and the comparable analysis of illiteracy, on pp. 295 and
1187 of Vol. I, 1910 U.S. Census Reports). With regard to the age query,
it is probable that when date of birth is asked for there may be a greater
tendency to state “unknown” than when the age last birthday is
requested, for in the latter case an approximate age rather than the reply
“unknown” will more frequently be given without hesitation. In the
United States censuses since 1910 the enumerators have been instructed
to state the approximate age rather than return the age as unknown; and
although this will reduce the numbers whose ages are returned as
completely unknown, it nevertheless undoubtedly tends to increase the
concentration on multiples of 5.

In the United States census of 1940, the “age unknown” category was
eliminated entirely from the published tables by a method developed by
W. E. Deming (see W. E. Deming, “The Elimination of Unknown Ages
in the 1940 Population Census,” Bureau of the Census). Where the age
shown on the census schedule was unstated, partially stated, inconsistent,
or illegible, a “master indicator” correlated the defective age with other
information available on the schedule such as marital status, highest
grade of school completed, employment status, ages of other members in
the family, etc., and so supplied an estimated age. The system develops
estimated ages consistent with the other information at a fraction of the
cost formerly expended in entering unknown age classifications in the
numerous published tables. The method was also adopted, with slight
modifications, in the Canadian census of 1941.

(5) Under-enumeration

44. Except with respect to the data for very young children (as already
discussed in pars. 30-32), little information is available concerning the
extent of any actual incompleteness of enumeration in census returns.
The possibility of such under-enumeration is recognized and discussed, for
example, in the United States Life Tables and Actuarial Tables, 1939-41
(pp. 1, 101, and 102).* In those tables, however, just as in the data of
other countries where the censuses are taken through a carefully organized
system, it is believed that in general the unreported cases constitute only
a small and negligible percentage of the totals involved. Ordinarily,
therefore, no attempt is made to introduce any direct adjustments for
under-enumeration at the census, beyond those which are inherent in the
other types of correction discussed in pars. 27-49 here.

* Two recent analyses of the U.S. Selective Service data (by D. O. Price in
the American Sociological Review of February, 1947, and R. J. Myers in the
same journal for June, 1948) have raised the suspicion that significant
under-enumeration at the census may exist in some groups, although it is
recognized, on the other hand, that the anomalies discussed in those papers
may result rather from over-registration in the selective service figures.

In order to estimate the extent of error in the 1950 U.S. population census,
the Bureau of the Census has made an intensive small-scale sample check,
through highly trained special enumerators, of approximately 22,000
households in a sample of areas, with the object of discovering any people
who were missed in the census, any who should not have been but actually
were enumerated, and such duplicate enumerations as those at both
permanent and temporary residences. The survey was also used to check the
accuracy of various items, such as age, birthplace, occupation, industry,
etc., which had been given at the census in properly enumerated cases, the
checks including subsequent matching with the independent records
available in birth certificates, previous censuses, social security data,
etc. M. H. Hansen (The American Statistician, VI, 7) has reported that the
preliminary results indicate “a net omission from the census of between 1%
and 1½% of the population; this is the difference between total omissions
of a little more than 2%, and duplications or other over-enumerations of
less than 1%.”

Errors in Registration Statistics

45. In Great Britain (see par. 24) and some European countries, in
which systems providing for compulsory registration of births, deaths,
and marriages have existed under centralized control for many decades,
the returns are reliable and complete, so that usually they can be accepted
without any correction for possible under-registration. In “A Note on the
Under-registration of Births in Britain in the Nineteenth Century”
(Population Studies, V, 70), for example, D. V. Glass has concluded, by
the indirect method of estimating the populations at ages under 5 from
the statistics of births and deaths (cf. pars. [?] and 92 here) and then
comparing the results with the actual census enumerations, that birth
registration appears to have been complete in England and Wales since
the census year 1881, and in Scotland since 1861.

In the United States and Canada, however, where nation-wide birth
and death registration systems have been operating for a comparatively
short time and the transcripts are derived from areas in which registration
is believed to be only 90% or more complete (see pars. 18-23), it is obvious
that the statistics of both births and deaths may be affected by
under-registrations of significant amounts. Until recently little was known
of the actual extent of these omissions, although their importance as a
source of error was of course fully realized.

(1) Under-registration of Births

46. With respect to under-registration of births, direct investigations
were accordingly undertaken in connection with the 1931 and 1941
censuses of Canada, and the 1940 census of the United States, with a view
to obtaining some measure of the extent of the omissions.

In Canada, a sample of 26,205 children under one year of age was taken
for the nine Provinces from the records of the 1931 census, and the files
were searched for their birth registrations. Of these children who were
thus actually enumerated at the census, 88% were matched with their
birth registrations, the total unmatched percentage of 12% being
composed of Provincial percentages varying from 20% for Prince Edward
Island and Nova Scotia to 6% for Quebec. These unmatched percentages,
however, are partly attributable to misspelled names, children having
been adopted, illegitimate children who could not be traced, immigrant
children erroneously reported as born in Canada, and clerical errors by
the matching staffs. For these reasons it was concluded that the real
deficiency in the birth registrations was not more than half the percentage
unmatched (see pp. 231-34, Vol. XII, Monographs, Census of Canada,
1931), which thus would be approximately 6% for the whole country. At
the census of 1941, the census data and birth records were again matched
for a number of districts, with resulting estimates of under-registration
amounting to 3% for the 1931-41 census period and 2.5% for 1936-41
(see 1941 Census of Canada, Monograph No. 1; and comments by D. V.
Glass in Population Studies, V, 88). This figure doubtless has improved in
the last few years as a result of the introduction of cash “family
allowances,” which naturally have caused parents to make sure that births
are duly registered.

In the United States census of 1940, special cards were prepared for all
the enumerated infants who were under 4 months of age on April 1, 1940
(the census date). These cards were then matched against copies of the
birth certificates for all births reported as having occurred between
December 1, 1939, and April 1, 1940; furthermore, copies of all the death
certificates of infants born in the 4-month period were matched with the
birth certificates. By this means the under-registration of births was
indicated to be about 8%, the components of this over-all percentage
varying for whites from 20.4% in Arkansas to 1% or less in Connecticut,
Minnesota, New Jersey, and New York, and for non-whites over a wider
range (in which the highest figure, 59.7%, was for New Mexico). Other
tabulations based on a sample with a more detailed racial classification for
the United States as a whole showed under-registration of 6% for whites,
18.1% for Negroes, and 24.9% for other races (see R. D. Grove, “Studies
in Completeness of Birth Registration,” Vital Statistics Reports, Vol. 17,
No. 18; I. M. Moriyama and T. N. E. Greville, “Effect of Changing Birth
Rates upon Infant Mortality Rates,” Vital Statistics Special Reports,
Vol. 19, No. 21; and T. N. E. Greville, “U.S. Life Tables and Actuarial
Tables, 1939-1941,” p. 103). Improvement in these figures was
anticipated as a result (among other reasons) of the increasing proportion of
births which now occur in institutions. This improvement has been
confirmed by a second nation-wide test (see “Birth Registration
Completeness, U.S., 1950” by S. Shapiro and J. Schachter, Public Health
Reports, LXVII, No. 6), which was made by comparing the birth records
with the infant cards of the 1950 census (see par. 12 here).

In making tests of this kind, the estimated percentage of under-registration
is based on the number of infants enumerated in the census
and not on the total infant population. Usually it is found that
non-registration and non-enumeration are correlated, so that the proportion
of births not registered is greater among those infants not enumerated in
the census than among those enumerated. This means that the estimates
of the extent of under-registration are somewhat too low. C. Chandrasekar
and W. E. Deming (“On a Method of Estimating Birth and Death
Rates and the Extent of Registration,” J.A.S.A., XLIV, 101) accordingly
have suggested (on the basis particularly of the somewhat intractable
material of India) that the correlation between non-registration and
non-enumeration may be regarded as due to a heterogeneity in the
population with regard to certain characteristics which affect both types of
reporting. The correlation can therefore be minimized by dividing the
population into homogeneous groups, estimating the total births separately
for each group, and then obtaining the grand total by addition.
This suggestion undoubtedly is theoretically sound, although it may be
difficult to apply successfully in practice because tabulations of the infant
population by those characteristics which have most influence on the
quality of reporting are sometimes difficult to obtain. The method has
been applied to the 1940 birth registration tests in the United States,
using data for small geographic areas in an effort to secure homogeneous
groups; except in a few States, however, the results did not differ
appreciably from those based on the assumption of zero correlation (see
S. Shapiro, “Estimating Birth Registration Completeness,” J.A.S.A.,
XLV, 264, and comments therein with respect to additional tests planned
for 1950 and the possibility that the method here under discussion may
have “an appreciable effect on measures of registration completeness only
in those areas with a comparatively high degree of under-registration”).
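
The effect of estimating by homogeneous groups can be sketched
numerically. The sketch below assumes the usual dual-record form of the
estimate, in which the total births of a group are taken as (registered x
enumerated) / matched; that formula, the function name
`estimate_total_births`, and all of the counts are assumptions made for
illustration and are not taken from the text.

```python
def estimate_total_births(matched, registered_only, enumerated_only):
    """Dual-record estimate of the total births in one group.

    Assumes that, within the group, non-registration and
    non-enumeration are uncorrelated.
    """
    registered = matched + registered_only
    enumerated = matched + enumerated_only
    return registered * enumerated / matched


# Hypothetical counts: (matched, registered only, enumerated only).
groups = {
    "group A": (900, 50, 40),    # comparatively complete reporting
    "group B": (400, 100, 120),  # comparatively poor reporting
}

# Estimate each homogeneous group separately, then add.
stratified_total = sum(estimate_total_births(*g) for g in groups.values())

# Estimate from the pooled counts, ignoring the heterogeneity.
pooled = [sum(column) for column in zip(*groups.values())]
pooled_total = estimate_total_births(*pooled)

print(round(stratified_total, 1), round(pooled_total, 1))
```

With these invented counts the grouped estimate is the larger of the two,
which agrees with the remark that estimates ignoring the correlation put
the under-registration somewhat too low.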

(2) Under-registration of Deaths

47. In connection with under-registration of deaths, as in the case of
births, due weight must be given to the efficiency of the registration
system. In the United States, for instance, registration in many of the
southern and most of the western States was established much later than
in the Northeast, and consequently is less well developed, especially in
the rural areas, and more particularly among Negroes in the South. Very
little specific information is available, however, on the extent of such
under-registration, and no adjustment was contemplated in any of the
Census Bureau’s life tables until the problem was examined carefully in
the construction of the complete tables for 1939-41 (see U.S. Abridged
Life Tables, 1930-39 (Preliminary), p. 1; U.S. Abridged Life Tables, 1939
(Urban and Rural), p. 5; and U.S. Life Tables and Actuarial Tables,
1939-41, pp. 1, 101, 102, and 103-6). In that examination it was pointed
out, of course, that if the deaths and the census enumerations (as to which
see par. 44 here) are both under-stated by the same percentage (which,
however, would hardly be expected), the unadjusted data of both would
produce correct mortality rates, but that under-reporting in the deaths to
an extent greater or less than in the populations would result in deficient
or excessive rates. Because information on this matter of relative
incompleteness was “almost entirely lacking,” the 1939-41 tables were again
constructed without any direct adjustment for under-registration of
deaths, although the data were handled throughout by various methods
(discussed elsewhere in this Study) which were designed to minimize any
effects of relative incompleteness in the deaths and populations. An
elaborate analysis of the problem was made, however, for the first year of
life, because “there is evidence that the proportion of infant deaths
[under 1 year of age] not reported is sufficiently large to have an
appreciable effect on life table values.” Even there, the problem had to be
approached indirectly, by examining the deaths per 1,000 births (corrected
for incomplete birth registration), in each of seven sub-divisions of the
first year of life for each State, in relation to the completeness of birth
registration in the State; the figures so arranged showed anomalies which
seemed to be explainable only on the theory that the deaths in that year
“are affected by an incompleteness of reporting having, in general, the
same geographical incidence as in the case of births”; and a numerical
estimate was reached which suggested that white and non-white infant
deaths were respectively about 9A.5% and slightly less than SP/f. reported
(see the 1939-41 Tables, p. 106).
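
The point about relative incompleteness is simple arithmetic, and a
minimal sketch may make it concrete. The function name `observed_rate`
and every figure below are hypothetical and serve only to illustrate the
statement above.

```python
def observed_rate(true_deaths, true_population,
                  death_understatement, population_understatement):
    """Crude rate computed from under-stated deaths and populations.

    The two understatement arguments are proportions, e.g. 0.05 = 5%.
    """
    deaths = true_deaths * (1 - death_understatement)
    population = true_population * (1 - population_understatement)
    return deaths / population


true_rate = 1000 / 100000  # hypothetical true rate

same = observed_rate(1000, 100000, 0.05, 0.05)       # equal understatement
deficient = observed_rate(1000, 100000, 0.10, 0.05)  # deaths worse reported
excessive = observed_rate(1000, 100000, 0.02, 0.05)  # deaths better reported
```

Equal percentages of understatement cancel and leave the rate unchanged;
only the difference between them distorts it.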

(3) Stillbirths

48. The chief difficulties which occur in connection with the births
actually registered arise from the designation of stillbirths. The definition
and interpretation of the term “stillbirth,” and its differentiation from
“live birth” and from “premature birth,” necessarily involve numerous
points of uncertainty and controversy. Furthermore, the statistical
methods adopted for the treatment of stillbirths obviously may have a
marked effect upon the resulting mortality rate shown by the first year
of age.*

The procedures adopted in different countries have followed several
patterns. The definitions, for instance, have described a stillbirth as a
child who has not lived “any time whatever, no matter how brief, after
birth” (in the United States, Canada, and Britain), or as one dying within
the two days before registration whether death be before, during, or after
birth (as in France, Belgium, and Holland); and the distinctions between
premature births and stillbirths are not uniform. The methods of recording
stillbirths† have also varied greatly, for although their registration
has been compulsory in numerous European countries for many years,
Great Britain did not adopt actual registration of stillbirths until 1927
(even though notification to the Medical Officer of Health had for a long
time been one of the requirements). In the United States the practice until
1939 was to register each case as a birth and also as a death, and in such a
manner that they could subsequently be excluded from the tabulation of
“living” births and deaths; since 1939, however, stillbirths have been
registered on a separate certificate as a distinct entity (cf. par. 22 here).

Under these circumstances the International Statistical Institute and
the League of Nations recommended a definition, which is used in many
countries, whereby a stillborn child means, in brief, one born after at least
28 weeks’ pregnancy and without respiration occurring. The definition
used in the United States, however, requires that only the 20th week of
gestation has been reached. The unsatisfactory character of the general
situation was summarized as recently as 1943 by F. E. Linder and R. D.
Grove (“Vital Statistics Rates in the United States, 1900-1940,” p. 57,
U.S. Census Bureau) in the statement that “irrespective of the definition
which is applied, there is no doubt that the registration of stillbirths is
very poor in almost every country.”

* For example, under the practice adopted in many government reports of stating
the so-called “infant mortality rate” as the ratio of the deaths of infants under 1 year
of age to 1,000 live births during the same calendar year (an approximate method only,
as will be evident from formulae (15)-(18a) and (21)-(23) of Section VI herein; see
also par. 79, footnote), the inclusion of stillbirths in the data of both births and
deaths would increase that rate in New Zealand from 38.74 to 68.08 in 1927, while the
system then used in France, Holland, and Belgium by which deaths within two days
of birth were treated as stillbirths and excluded from the computations would have
reduced the 38.74 to 27.40 (see “New Zealand - Infant Mortality Rates and Stillbirths”
by Malcolm Fraser [Government Statistician], J.R.S.S., XCII, 429 and 431).

† For more detailed material see Newsholme’s Vital Statistics, pp. 77-83; T.A.S.A.,
XIX, 139 and 141; the definitions of stillbirths, premature births (stillborn), and
premature births (not stillborn) in the Rules of Statistical Practice of the Vital Statistics
Section of the American Public Health Association (Annual Reports on Mortality
Statistics, U.S. Census Bureau, 1907 et seq.); P. G. Edge, “Vital Registration in
Europe,” J.R.S.S., XCI, 360; and Fraser’s paper mentioned in the preceding footnote.

(4) Age at Death, Cause of Death, and Occupation

49. The reported ages at death, like the ages returned at the census,
frequently show a marked preference for the digits 0 and 5, and for even
rather than odd digits. These features, however, do not always occur at
the same ages as in the census figures, and usually they are not quite so
prominent (except in the more backward communities). They have been
analyzed on a number of occasions in the reports of the Registrars-General
of England and Wales and of Scotland, by the methods already
described in par. 34. In the U.S. Life Tables and Actuarial Tables,
1939-41, Greville made an examination of the white and non-white deaths of
1935 (the most recent year for which they were available by single years
of age) by Myers’ method (see par. 35 here); the investigation confirmed
the selection of 0 and 5 by whites and non-whites, indicated preference for
2, 4, and 8 by whites but not by non-whites, and showed less bias than in
the census returns for the whites but greater bias among non-whites. It
was also pointed out in the same volume (p. 111) that among Negroes the
deaths (and also the populations) were so seriously misstated around age
65 (presumably as a result of the establishment of the social security
programs) that a special preliminary redistribution of the data was
considered advisable between ages 55 and 70 (see also pars. 40 and 52 herein).
The general similarity between the ages selected in the census returns and
in the death records may often be seen from the fact that a series of
death-rates calculated as ratios from the unadjusted death and census data is
usually much smoother than either of the series of death or census returns
alone.

The statements by the physician respecting the cause of death are
affected by errors resulting from careless or incomplete certification,
particularly failure to distinguish between the immediate and contributory
causes in such a manner that effective statistical use of the certificates
can be made. The registration authorities in Britain, the United States,
and Canada have given much attention in recent years to the development
of plans by which the physician in his medical certification shall
state definitely the underlying cause to which, in his opinion, the death
should be charged statistically. The difficulties in thus recording, and
later tabulating, the data with regard to cause of death are considered in
Section XI.

The records of the usual occupation and the industry or business of the
father and mother on the birth (and stillbirth) certificates, and the similar
questions on the death certificate, are of course important for statistical
purposes in studies of fertility and mortality by type of work and
economic status, and therefore should be handled as closely as possible in
conformity with the occupational classifications adopted in the census
tabulations. This, however, is rendered difficult by the fact that the
statements of occupation on the death certificate are necessarily made by
friends or relatives who may be unacquainted with the precise details or
indifferent to their importance. The problems involved in dealing with
the resulting errors are discussed in Section XII.

(5) Deaths and Births of Non-residents 

50. A further question of considerable difficulty arises in connection
with the deaths and births of non-residents, and the children born of
non-resident parents (or mothers). In order to determine accurate birth and
death rates it is clear that the births and deaths which enter into the
calculations should be those which correspond strictly to the census
populations. That is to say, if in a particular locality the population actually
enumerated by the census can be assumed to be the population normally
“resident” in that locality, then the deaths registered therein of persons
normally domiciled outside its boundaries (and enumerated elsewhere by
the census) should be excluded, while the deaths occurring outside the
locality, of “residents” who were enumerated within it by the census,
should be included; and the births should be treated similarly. When the
census aims to record each individual at his “usual place of residence”
(his “usual place of abode”), as in the “de jure” method, the general
principle would be to tabulate the births and deaths by the “usual place of
residence” as that phrase is interpreted in the census instructions; but if
the population is enumerated by the “de facto” method the distribution
by residence is not usually as accurate, and special adjustments of the
census populations may be necessary as a preliminary measure in order to
obtain an approximate figure for the “resident” population which will
correspond to such a tabulation of births and deaths according to their
“usual residences” (see, for example, the Registrar-General’s Statistical
Review of England and Wales, 1921, Text, pp. 91-93).

In the United States, with its “de jure” censuses and the internal
movements of its population which result from modern travel facilities
and its numerous urban medical institutions, the allocation of non-resident
deaths to their places of residence has been developed carefully by
the Bureau of the Census in order to provide consistent death and census
statistics for areas smaller than the country as a whole.* The general
rules followed are that for births “the residence of the child is defined as
the residence of the mother”; for decedents, those “who at the time of
death had been living more than one year in a community are considered
residents of that community even though some other place of residence is
stated”; those “in general hospitals, tuberculosis sanatoriums, etc., are
reallocated to the place of residence,” and those “in mental institutions or
other institutions where the duration of stay is usually long are not
reallocated to the place of prior residence.” The importance of the problem
may be judged from the fact that in 1940, for the United States as a
whole, 14.8 per cent of all deaths were non-residents, while for cities of
10,000 to 100,000 over 23 per cent of deaths were non-residents and 36.8
per cent of births were to non-resident mothers. It must be noted,
however, that of course such figures do not measure the net result of
reallocation, because the non-resident births and deaths are reducible by the
births and deaths of residents which occurred elsewhere. The general
conclusion has been reached in the United States that rates for individual
States or groups of registration States, and even State figures for
particular age, sex, or race groups, based on “place of occurrence” data, would be
changed only slightly if the tabulations were made by “place of
residence,” but that care must be exercised in respect of cities, counties, or
other units smaller than the State (see F. E. Linder and R. D. Grove,
“Vital Statistics Rates in the United States, 1900-1940,” pp. 15-18).

In Great Britain the deaths in institutions have been referred to the
place of usual residence for many years. In 1911 a comprehensive plan
was adopted by which all non-resident deaths, in institutions and
elsewhere, are transferred as nearly as possible to the area of residence.
Similar rules are employed for the distribution of births according to the
mother’s residence. A new enquiry as to the usual residence, moreover,
was included in the census of 1931, in order, amongst other objectives, to
facilitate the determination of the “de jure” resident population
notwithstanding the “de facto” basis of the British census system (see also par. 9
here).

The annual Canadian reports on Vital Statistics have presented all the
tabulations by place of residence since 1944, with classification of the
births by place of residence of the mother. Previously the data were
shown by place of occurrence, although some special tabulations according
to residence were made for several years prior to 1944.

* The first (1906) of the annual volumes on Mortality Statistics included a
discussion of the problem; total deaths for cities and counties were published by residence
in the volume for 1914, and subsequently for each year from 19IS to 19d0; in 1*M.S
more detailed tabulations were given for births as well as deaths; and since 19.W still
more extensive data have been made available.



V 


PRELIMINARY ADJUSTMENTS FOR ERRORS OF
AGE IN CENSUS AND REGISTRATION STATISTICS;
AND ESTIMATES OF POPULATIONS

Before taking up the main problem of the construction of mortality
tables (Sections VI and VII) it will be well to consider two preliminary or
independent adjustments which are sometimes required. The first of
these concerns the elimination, so far as may be advisable in a special
preliminary operation, of some of the worst of the disturbing effects of the
errors of age considered in the preceding Section IV; the second is the
problem of estimating the populations living at some time other than the
census date.

(1) Preliminary Adjustments for Errors of Age

51. Of the various types of errors of age dealt with in Section IV, the
numbers at “unknown” ages (par. 43), when they are stated as a separate
category, are usually absorbed into the general body of data by being
distributed in proportion to the populations of known ages.
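
The pro rata distribution of the “unknown” ages can be sketched as
follows. The function name `distribute_unknown` and the figures are
hypothetical and are used only to show the arithmetic.

```python
def distribute_unknown(known, unknown):
    """Spread the number at unknown ages over the known ages pro rata.

    known   -- mapping of age (or age group) to the number recorded
    unknown -- number returned at "unknown" ages
    """
    total_known = sum(known.values())
    return {age: n + unknown * n / total_known for age, n in known.items()}


# Hypothetical population by age group, with 50 persons of unknown age.
known = {"0-4": 200, "5-9": 300, "10-14": 500}
adjusted = distribute_unknown(known, 50)
```

The adjusted figures add up to the whole enumerated population, known
and unknown together.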

52. The tendency to concentrate on certain digits of age, especially in
extreme cases, is sometimes dealt with by a preliminary redistribution.
One elementary instance of this has already been illustrated numerically
in par. 34(c) for the special purposes there mentioned.

In the preparation of the Austrian National Life Tables discussed in
J.I.A., LIII, 225, the populations and deaths were redistributed over the
three ages comprising the age showing the concentration and one on each
side of it, by taking the average of the three as the adjusted central value,
and by correcting those on each side on the assumption that the error in
each was proportional to the number observed and that the total of the
three values should remain unaltered.
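
One reading of this three-age rule can be sketched as follows. It is
assumed here that the excess removed from the central age is shared
between the two neighbouring ages in proportion to the numbers observed
at them; the function name `redistribute_three` and the figures are
hypothetical.

```python
def redistribute_three(lower, centre, upper):
    """Redistribute a concentration at `centre` over three adjacent ages.

    The adjusted central value is the average of the three; the excess
    taken from the centre is added to the two sides in proportion to
    the numbers observed there, so that the total is unaltered.
    """
    total = lower + centre + upper
    new_centre = total / 3
    excess = centre - new_centre
    side_sum = lower + upper
    new_lower = lower + excess * lower / side_sum
    new_upper = upper + excess * upper / side_sum
    return new_lower, new_centre, new_upper


# Hypothetical numbers at ages 59, 60, and 61.
adjusted = redistribute_three(60, 180, 90)
```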

In the United States Life Tables and Actuarial Tables, 1939-41 (pp.
111-12), the marked disturbances in the Negro populations, and in the
deaths, between ages 55 and 69 inclusive (to which references have been
made in pars. 40 and 49 here) were dealt with by separate preliminary
redistributions. The ratios of the Negro populations or deaths to the
corresponding white populations or deaths were calculated for the six age
groups 50 and over, 55 and over, ..., and 75 and over, whence corrected
ratios for 60 and over and for 65 and over were inserted in the series by
interpolation from the other four values by Waring's* (Lagrange’s)

* Edward Waring’s priority as the discoverer (in 1779) of the interpolation formula 
to which Lagrange’s name is usually attached was pointed out by T. N. E. Greville 

formula, and then the adjusted values for the five quinquennial
age-groups 50-54, ..., 70-74 were obtained by differencing. This method,
which gives exactly the same numerical results as the direct application of
the method of par. 34(f) to the five quinquennial age-groups themselves,
leaves unchanged the values at each end of the series of five age-groups
between ages 50 and 74, and reproduces the total for the three in the
middle (for ages 55 to 69) and also the total for the five between ages 50 and
74.
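
The steps of this redistribution (ratios of cumulative totals,
interpolation of the two disturbed ratios, and differencing) can be
sketched as below. The function names and all figures are hypothetical;
only the order of operations follows the description above.

```python
def waring_lagrange(xs, ys, x):
    """Value at x of the polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def redistribute(ages, reference_cum, disturbed_cum, fixed=(50, 55, 70, 75)):
    """Adjusted quinquennial values for the disturbed series.

    ages          -- lower limits of the "x and over" totals
    reference_cum -- cumulative totals of the undisturbed series
    disturbed_cum -- cumulative totals of the series to be corrected
    fixed         -- ages whose ratios are retained as observed
    """
    ratios = [d / r for d, r in zip(disturbed_cum, reference_cum)]
    fixed_ratios = [ratios[ages.index(a)] for a in fixed]
    for k, age in enumerate(ages):
        if age not in fixed:
            ratios[k] = waring_lagrange(fixed, fixed_ratios, age)
    adjusted_cum = [r * c for r, c in zip(ratios, reference_cum)]
    # Differencing the cumulative totals gives the groups 50-54, ..., 70-74.
    return [adjusted_cum[k] - adjusted_cum[k + 1] for k in range(len(ages) - 1)]


# Hypothetical totals for "50 and over", "55 and over", ..., "75 and over".
ages = [50, 55, 60, 65, 70, 75]
reference_cum = [1000.0, 800.0, 600.0, 420.0, 270.0, 150.0]
disturbed_cum = [120.0, 90.0, 75.0, 44.0, 28.0, 15.0]
adjusted = redistribute(ages, reference_cum, disturbed_cum)
```

Because the totals at 50, 55, 70, and 75 and over are untouched, the end
groups and the total of the three middle groups come out unchanged, as
stated in the text.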

Another type of special adjustment, which may also be recorded here,
was employed in presenting comparative life-table values for various
countries in the United States Life Tables and Actuarial Tables, 1939-41
(p. 14 and footnote), in order to correct certain ungraduated rates of
mortality for Mexico which were overstated at ages ending in 0 and 5 in
consequence of a marked preference for those digits in recording the ages at
death, the number of deaths in the life-table cohort at each quinquennial
age, y, from 15 to 90 being determined by taking the total deaths at ages
y to y + 4 as correct and then assuming the deaths at age y to be the same
fraction of the total deaths at ages y to y + 4 as they were in a graduated
life table for Mexico (prepared by Solórzano and Mortara).
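
The proportional step just described amounts to one line of arithmetic.
In the sketch below the function name and the figures are hypothetical;
the graduated values stand for whatever graduated table is taken as the
standard.

```python
def deaths_at_quinquennial_age(observed_group_deaths,
                               graduated_deaths_at_y,
                               graduated_group_deaths):
    """Deaths at age y, taking the observed total for ages y to y + 4 as
    correct and borrowing the share of age y from a graduated table."""
    share = graduated_deaths_at_y / graduated_group_deaths
    return observed_group_deaths * share


# Hypothetical: 500 observed deaths at ages 30-34; in the graduated table
# 18 of every 100 deaths at ages 30-34 fall at age 30.
corrected = deaths_at_quinquennial_age(500, 18, 100)
```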

Other examples occur in several actuarial reports dealing with the
censuses of India. In J.I.A., XLVII, 320 (following the earlier reports of
G. F. Hardy), T. G. Ackland made a rough adjustment of the abnormal
concentration shown at ages ending in 0 and 5 in the Indian returns by
the following method: The number returned at age group 0-4 was
increased by one-half the number recorded at age 5; group 5-9 was adjusted
by correspondingly deducting the number already transferred to group
0-4, and by adding one-half of the excess of the number at age 10 over the
mean of the numbers at ages 9 and 11; group 10-14 was then adjusted by
correspondingly deducting the one-half (already transferred to group 5-9)
of the excess of the number at age 10 over the mean of the numbers at
ages 9 and 11, and by adding one-half of the excess of the number at age
15 over the mean of those at 14 and 16; and the subsequent quinquennial
groups were treated by the same method as group 10-14.*
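
The successive transfers in Ackland's rough adjustment can be sketched as
below, following the verbal rule just given. The function name
`ackland_adjust` and the single-age figures are hypothetical, and the
sketch assumes that the returns cover a whole number of quinquennial
groups starting at age 0.

```python
def ackland_adjust(numbers):
    """Adjusted quinquennial totals from numbers at single ages 0, 1, 2, ...

    Each group receives a transfer from the group above it and gives up
    the transfer already made to the group below it.
    """
    n_groups = len(numbers) // 5
    groups = [sum(numbers[5 * k:5 * k + 5]) for k in range(n_groups)]

    def transfer(k):
        """Amount moved from the group beginning at age 5k to the one below."""
        age = 5 * k
        if k == 0 or age + 1 >= len(numbers):
            return 0.0  # nothing below age 0, or no data beyond the range
        if k == 1:
            return numbers[age] / 2  # one-half the number at age 5
        mean_of_neighbours = (numbers[age - 1] + numbers[age + 1]) / 2
        return (numbers[age] - mean_of_neighbours) / 2

    return [groups[k] + transfer(k + 1) - transfer(k) for k in range(n_groups)]


# Hypothetical returns at ages 0-19, heaped at ages 5, 10, and 15.
numbers = [10, 9, 9, 8, 8, 14, 7, 7, 6, 6, 16, 5, 5, 5, 4, 12, 4, 4, 3, 3]
adjusted = ackland_adjust(numbers)
```

Since every amount added to one group is deducted from the next, the
total of the adjusted groups equals the total returned.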

A more elaborate procedure was evolved subsequently by H. G. W.


(with acknowledgment to W. E. Deming and D. C. Fraser) in the Annals of
Mathematical Statistics, XV (1944), 218. A note on this priority, and on
Waring’s life and work, is given by H. H. Wolfenden in T.A.S.A., XLVI, 97-98.

* fn thccumpututiuii shown Iiy Ackland on hi.sp. .^21, the sec md line of the forrniilae 

should be S Ubn+t — \ (A*i/ 6»+4 — with column (4) headed and 

0 

column (5) A*M 5 h i. 



Meikle in his actuarial report on “The Age Distribution and Rates of
Mortality Deduced from the Indian Census Returns of 1921 and Previous
Enumerations” (reviewed in T.A.S.A., XXVII, 467). Taking as an
example the data


        Year of Age             Number Reported
         Recorded                at the Census

        58-59 ..................        130
        59-60 ..................         56
        60-61 ..................      2,509
        61-62 ..................         58
        62-63 ..................        213
                                      -----
                                      2,966


which show the tremendous concentration on the age ending in 0, and
treating the ages (which were intended to be as of age last birthday) as
being more closely nearest ages (see par. 42 here) so that the whole 2,966
could be viewed preferably as lying between actual ages 57½ to 62½,
Meikle supposed that in this range half of the 2,509 would be aged 57½ to
60 and half aged 60 to 62½; next assuming that $\tfrac{2{,}966}{5}$ would
be aged 59½ to 60½, that $\tfrac{2{,}966}{5}(1 - q_{60})$ would represent
approximately those between 60½ and 61½, and that
$\tfrac{2{,}966}{5}(1 - 2q_{60})$ would be aged 61½ to 62½, and that
similarly $\tfrac{2{,}966}{5}(1 + q_{60})$ would be between ages 58½ and
59½, and $\tfrac{2{,}966}{5}(1 + 2q_{60})$ between 57½ and 58½, and further
taking half of the $\tfrac{2{,}966}{5}$ assumed aged 59½ to 60½ as being
aged 59½ to 60, he obtained the corrected total of the whole group under
60 as
$\tfrac{2{,}966}{5}\left[\tfrac{1}{2} + (1 + q_{60}) + (1 + 2q_{60})\right] = 2{,}966(.5 + .6q_{60})$,
and therefore $2{,}966(.5 + .6q_{60}) - (56 + 130)$ as representing the
shortage at those ages.
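
The arithmetic of this correction can be checked with a short sketch. The
value of q60 used below is hypothetical (the text does not give one), as
is the function name; the total of 2,966 and the reported numbers 130 and
56 are those of the table above.

```python
def meikle_under_60(total, reported_under_60, q60):
    """Corrected number under age 60 within the group, and the shortage.

    The five assumed yearly shares are total / 5 times
    (1 + 2q), (1 + q), 1, (1 - q), (1 - 2q); the part under 60 is the
    first two shares plus half of the middle one.
    """
    fifth = total / 5
    corrected = fifth * (0.5 + (1 + q60) + (1 + 2 * q60))
    return corrected, corrected - reported_under_60


q60 = 0.03  # hypothetical value, for illustration only
corrected, shortage = meikle_under_60(2966, 130 + 56, q60)
```

The five shares add up to the whole 2,966 whatever value of q60 is taken,
so the correction moves numbers between ages without altering the group
total.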

Preliminary redistributions for errors of age, such as those just
described, have usually been applied only when the disturbances are very
marked. Their importance should not be under-estimated, however,
because the digit misstatements of age with which they are intended to deal
are, in fact, generally systematic and cyclical even in reliable material, so
that the data uncorrected by preliminary redistributions are often
affected by an inherent waviness which is difficult to handle satisfactorily
by any subsequent graduation method (see also Wolfenden’s remarks on
the principle of preliminary redistributions in T.A.S.A., Xldl, 85).

53. Grouping. The more usual method of dealing with the concentration
on particular ages is to group together the data for certain adjacent
ages so that the total of each group may be assumed to be approximately
correct, these group totals, instead of the values at individual ages, then
being employed as the basic data. This principle requires that the age at
which concentration occurs shall be in the same group with the ages from
which that concentration is drawn. The selection of appropriate groups in
this manner would not be difficult if, for example, concentration at ages
ending in 0 were the main feature; for then the groups 15-24, 25-34, etc.
(which were used for many years in the treatment of the vital statistics of
England and Wales, and are still employed in the reports of some
countries) would be satisfactory so long as it could be assumed that the
concentration was not drawn from ages more than four or five years from the
decennial point. And where there is concentration on ages ending in 5 as
well as in 0 the analogous quinquennial grouping would be 13-17, 18-22,
etc., in which the main point of concentration is again central to each
group. This quinquennial arrangement will be referred to as the “3-7”
grouping.

Another grouping which is used extensively in census and death returns
is that in which the quinquennial points are at the beginning of each
group, as in 15-19, 20-24, etc., which may be called the “5-9” grouping.
This method has been criticized because it assumes implicitly that the
heaping at each age which is a multiple of 5 is drawn solely from the ages
above and not at all from ages below. It has been justified, however, in the
73rd Annual Report of the Registrar-General of England and Wales (p.
ix), and since then has been used for some of the death statistics in those
reports, because it was found that “the heaping up of deaths at ages which
are multiples of 10 is caused mainly by transfer from the next succeeding
year of age in each case.” Further support was provided by A. A. Young’s
analysis for ages 20-55 in the 1890 U.S. population and 20-65 in the 1900
enumeration (see “The Comparative Accuracy of Different Forms of
Quinquennial Age Groups,” J.A.S.A., VII, 27, and Supplementary Analysis,
12th Census, p. 138); by testing the regularity of progression of the
group totals by the smoothness of the ratio

Gx-b I fi 

where $G_x$ is the group total for ages $x$ to $x + 4$, he concluded that the
“5-9” grouping (with the quinquennial year of concentration at the
beginning) there produced totals which progressed more regularly than
either the “3-7” grouping (with the year of concentration in the middle)
or the “1-5” grouping (in which the quinquennial point of concentration
is at the end). The “5-9” method was also preferred, after due
examination, for the 1931 census data in England and Wales (see T.A.S.A.,
XXXVII, 248, and Myers’ confirmation in T.A.S.A., XLI, 406), and in
Scotland (see the Supplement to the 78th Annual Report of the
Registrar-General for Scotland, Part I, Life Tables, p. 4). For the 1940 U.S.
populations, moreover, the “5-9” grouping was the best for white males (see the
U.S. Life Tables and Actuarial Tables, 1939-41, p. 122, and note Greville’s
additional reasons stated in par. 54 here for adopting that method
throughout).

Although one of the three quinquennial groupings just considered is
generally adopted for convenience in presenting the unadjusted data of
census and registration reports, it is clear that there are also two other
quinquennial arrangements, namely, the “2-6” grouping (that is, 12-16,
17-21, etc.) and the “4-8” method (composed of ages 14-18, 19-23, etc.).
The possible advantages of these groups were not considered in early
investigations, because attention was then directed mainly to the
importance of minimizing only the heapings at ages which are multiples of
5, and therefore naturally groupings were suggested in which the digits
5 and 0 were at the beginning (to deal with understatements), in the
middle (for evenly distributed understatements and overstatements), or
at the end (for overstatements). Now, however, it is recognized that
marked concentration occurs also at even digits, and especially at digit 8
(see pars. 33 et seq.); and investigations have therefore been made to
determine which of these five quinquennial methods of grouping will give
the most satisfactory results in the light of the known tendency to
concentrate on ages which are multiples of 2 as well as of 5.
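
The five quinquennial arrangements differ only in the end digit at which
each group begins, and the group totals can be formed as in the sketch
below. The function name `group_totals`, the labelling of a grouping by
its first end digit, and the heaped single-age figures are hypothetical.

```python
def group_totals(counts, first_digit):
    """Quinquennial totals for groups beginning at ages whose end digit is
    `first_digit` or `first_digit` + 5 (3 gives the "3-7" grouping, 5 the
    "5-9" grouping, and so on).

    counts -- mapping of single year of age to the number returned,
              covering consecutive ages without gaps
    """
    ages = sorted(counts)
    start = next(a for a in ages if a % 5 == first_digit % 5)
    totals = {}
    while start + 4 <= ages[-1]:
        totals[(start, start + 4)] = sum(counts[a] for a in range(start, start + 5))
        start += 5
    return totals


# Hypothetical returns for ages 10-34: 100 at each age, with 60 added at
# ages ending in 0 and 30 added at ages ending in 5.
counts = {a: 100 + (60 if a % 10 == 0 else 30 if a % 5 == 0 else 0)
          for a in range(10, 35)}

grouping_3_7 = group_totals(counts, 3)  # 13-17, 18-22, ...
grouping_5_9 = group_totals(counts, 5)  # 10-14, 15-19, ...
```

Only complete groups within the range of the data are returned, so the
several groupings do not necessarily cover exactly the same ages.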

54. (i) This question may be examined, firstly, by extending Young’s
ratio method of the preceding paragraph to the “2-6” and “4-8”
groupings as well as to the others. This was done by J. W. Glover in the case of
the 1910 populations and 1900-11 deaths of males in New York State,
the regularity of the ratios for each method of grouping being tested by
calculating the average difference between the ratios for groups including
ages ending in 0 and for those including ages ending in 5 (see U.S. Life
Tables, 1890, 1901, 1910, and 1901-10, p. 362). The results confirmed
Young’s conclusion as to the superiority of the “5-9” method over the
“1-5” and “3-7” groupings, but showed also that the “2-6” and “4-8”
methods are practically as good.

(ii) Another test used by George King in examining the data of the
1911 English census (see his Report on the Graduation of Ages in Vol.
VII, p. xl), and more exhaustively by P. C. H. Papps for the data
between ages 30 and 40 of the U.S. registration states in 1900 (“Effect of
Grouping in Graduation by Osculatory Interpolation,” J.A.S.A., XVI,
190), was to calculate the values at individual ages by osculatory
interpolation on the basis of each of the five possible groupings, and to
consider the most satisfactory grouping to be that which gives the best
interpolated results. As a criterion of “best” Papps employed the third
differences of the interpolated values, and also the deviations of the
interpolated values for each mode of grouping from the average of the results of
the various groupings. Glover (loc. cit.) also applied the same principle to
the 1910 New York State data, and used two further tests, namely, the
deviations between the actual and expected deaths, and the weighted
squared deviations between the graduated and observed rates of
mortality. As a result of these investigations King found the “5-9” grouping
to be unsatisfactory, and pronounced in favor of the “4-8” method;
Papps showed the “5-9” method to be inferior, and “3-7” in general the
best, the “2-6” and “4-8” groupings also being good; and Glover
discarded “1-5” and “5-9,” but found that “the decision as to groups ‘2-6,’
‘3-7,’ and ‘4-8’ still remains a problem.” In finally deciding, as did King,
in favor of the “4-8” method Glover said: “As between these three
groups it will be observed that groups ‘2-6’ and ‘3-7’ contain both the
ages ending in the digits 0 and 8 in the same quinquennial age group,
while the adjacent five-year groups contain the ages ending in the digit 5.
This tends to exaggerate unduly alternate quinquennial age groups in
these sets. With the group ‘4-8,’ however, the ages ending in the digits 5
and 8 are in the same quinquennial group and the ages ending in the digit
0 are in the adjacent five-year groups. Since the exaggeration for ages
which are multiples of 10 is undoubtedly greater than for ages which end
in the digit 5, the group ‘4-8’ would seem to furnish a better balanced
grouping than the group ‘2-6’ or ‘3-7.’ ” (For the populations of England
and Wales in 1921, moreover, the “4-8” method was held [in the General
Report and Appendices, for a review of which see T.A.S.A., XXIX, 330]
to be superior for females, although “2-6” was preferred for males; in
preparing English Life Table No. 9 therefrom Sir Alfred Watson adopted the
“2-6” grouping as being advantageous for both sexes. The “2-6” method
was also indicated by Myers’ analysis [see T.A.S.A., XLI, 406, and par.
54(iii) here] as being slightly better than “4-8” for the U.S. Census data
of 1910, 1920, and 1930.)

(iii) A third method proposed by R. J. Myers (and adopted by Greville
in preparing the 1939-41 U.S. Life Tables) emerges immediately from his
“blended” measure of digit preference as described in par. 35(b) of this
Study. Since the extent of the concentration or deficiency for each digit is
shown therein as a percentage (so that every percentage would be 10% if
no digit selection were present), it follows that the best grouping of five
digits would be that for which the sum of their five percentages is closest
to 50%. Myers thus showed the superiority of the “5-9” arrangement for
the 1931 census of England and Wales, and of the “2-6” method for the
U.S. census data of 1910, 1920, and 1930, as already noted (see T.A.S.A.,
XLI, 407); and by the same principle Greville found that for the 1940
U.S. populations the best groupings were “4-8” or “5-9” for whites and
“4-8” for non-whites, while for the 1935 deaths “1-5” was indicated for
whites and “2-6” for non-whites (see U.S. Life Tables and Actuarial
Tables, 1939-41, pp. 121-22).

(iv) These grouping methods are generally applied to separate
tabulations of populations and deaths, and the tests of digit
concentration and most advantageous grouping are made for various
reasons: for example, to examine the general character of the material,
the efficiency of the enumerators and registrars, the effectiveness of
the form of the age query, the extent of illiteracy, and the comparative
accuracy of age reporting in urban, rural, and different racial
communities. In such instances it is not usually necessary to maintain
any special consistency in the analyses which thus are applied
separately to the populations and deaths. However, when the separate
groups of population and death data are to be used for the construction
and graduation of mortality table ratios such as $m_x$ or $q_x$, it
becomes important to select groupings which will be consistent to the
extent that they will not of themselves exaggerate, maintain, or produce
waves in the $m_x$ or $q_x$ curves. Thus in preparing the 1926 life
tables for Northern Ireland, quinquennial groupings showed waves in the
$q_x$ curve as a result of more pronounced heaping of deaths than of
populations at digit 0, and in consequence a decennial “5-14” grouping
was adopted (see the Registrar-General’s Review of Vital Statistics of
Northern Ireland and Life Tables, 1926, p. 53, and T.A.S.A., XLII, 84).
In the 1939-41 U.S. life tables also, this desirability of using groups
of populations and deaths which would minimize waviness in the $q_x$
curve was discussed in the following words (see Greville, U.S. Life
Tables and Actuarial Tables, 1939-1941, p. 121): “In computing rates of
mortality, if the same grouping is to be used for both populations and
deaths, it is of little avail to select the most effective grouping for
populations if this grouping produces marked bias in the death figures,
and vice versa; on the other hand, the correct mortality rates will be
obtained, even with considerable error in both population and death
statistics, if both are deficient or both excessive in the same
proportion.” The best grouping for the mortality rate calculations was
therefore selected as that in which the smallest difference appeared
between the best “blended” percentages for the populations and deaths as
found by Myers’ method of the preceding sub-par. (iii) here; for
example, for white males the blended percentages by the five different
groupings were



                  Blended Percentages for
   Grouping      Populations      Deaths      Difference

   1-5               49.5          50.1           .6
   2-6               50.4          50.8           .4
   3-7               49.6          50.6          1.0
   4-8               49.9          50.8           .9
   5-9               50.0          50.4           .4


from which the “5-9” grouping emerges as the best for computing
mortality rates because the difference (with that for “2-6”) is smallest
and both the percentages themselves are better than for the “2-6”
arrangement.
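
The two selection rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the list `digit_pct` is invented data, the white-male figures are those read from the table above, and the tie-breaking rule (total distance of the two percentages from 50%) is one reading of the remark that "both the percentages themselves are better."

```python
def group_sum(digit_pct, start):
    """Sum the blended percentages of the five terminal digits
    start, start+1, ..., start+4 (taken modulo 10)."""
    return sum(digit_pct[(start + k) % 10] for k in range(5))


def best_single_grouping(digit_pct):
    """Label of the quinquennial grouping whose five-digit sum is closest to 50%."""
    candidates = {f"{s}-{s + 4}": group_sum(digit_pct, s) for s in range(1, 6)}
    return min(candidates, key=lambda label: abs(candidates[label] - 50.0))


def best_grouping_for_rates(blended):
    """blended maps a grouping label to (population %, death %).

    Rank first by the difference between the two percentages and then,
    to break ties, by their total distance from 50% (an assumed rule).
    """
    def key(label):
        pop, dth = blended[label]
        return (round(abs(dth - pop), 1), abs(pop - 50.0) + abs(dth - 50.0))
    return min(blended, key=key)


# Hypothetical blended percentages for terminal digits 0..9 (they sum to 100).
digit_pct = [15.0, 8.0, 10.0, 9.0, 9.0, 12.0, 9.5, 9.0, 10.5, 8.0]

# White-male percentages (populations, deaths) as read from the table above.
white_males = {
    "1-5": (49.5, 50.1),
    "2-6": (50.4, 50.8),
    "3-7": (49.6, 50.6),
    "4-8": (49.9, 50.8),
    "5-9": (50.0, 50.4),
}
```

With the table figures, `best_grouping_for_rates(white_males)` selects the “5-9” grouping, in agreement with the conclusion drawn in the text.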

(v) All the methods considered in this and the preceding paragraphs
are based on equal quinquennial or decennial groups throughout.
Departures from this system of equal groups, however, have been examined
carefully as a means of dealing with the intractable material of India.
In his report on the 1921 Indian census (see pars. 34(b), 42, and 52
here, and T.A.S.A., XXVII, 470), H. G. W. Meikle discarded the “2-6”
method which appeared to be desirable if equal quinquennial groups had
been adopted, and recommended instead that for the census of 1931 a new
system should be employed in which the groups (based on the assumption
that the ages stated in the returns should be taken as really being
nearest ages; see par. 42 here) should consist of “three and seven ages
alternately, according as the middle age of each group is an odd or an
even multiple of 5.” The advantages of this proposal (which may be
identified as a “4-6; 7-13” arrangement) were stated to lie in the
simpler formulae (given in the report) for obtaining a regrouping of the
figures according to the “5-9” method, as is sometimes desirable for
comparisons with other data, and in the fact that it appeared to
produce, in the particular case of India, a more reliable series of
values of $q_x$ for final graduation than the “2-6” method.

L. S. Vaidyanathan, however, in his report on the 1931 census
expressed the view that actually “the numbers returned at each age would
have been the same whether ages next birthday or last birthday or
nearest birthday had been asked for”; consequently he investigated the
results of Meikle’s “4-6; 7-13” method in conjunction with his nearest
birthday assumption, and also of another arrangement of unequal groups
of four and six ages based on “3-6; 7-12” and the assumption that the
ages recorded at the census were ages last birthday. After examining
these two unequal grouping plans as well as all the five usual equal
quinquennial arrangements, Vaidyanathan concluded that the “2-6” method
of equal groups was the best of all for the particular errors
encountered in the Indian returns.

The numerous investigations respecting the best methods of grouping
which are recorded in the preceding paragraphs have been based on widely
different data and have led to varying decisions. In all cases the
conclusions reached were simply those which appeared, on statistical and
general grounds, to be acceptable for the particular material at hand.
No general rules, therefore, can be laid down (cf. H. H. Wolfenden’s
remarks, T.A.S.A., XLII, 83-85).

(2) Estimates of Populations 

55. As stated in pars. 7 and 11, censuses are not generally taken on
December 31 or January 1, while birth and death statistics are usually
tabulated by calendar years. It is therefore frequently necessary,
before undertaking the construction of a mortality table from such data,
to relate the populations as well as the births and deaths to the
beginnings or ends of calendar years. A similar problem is encountered
by vital statisticians, independently of the construction of mortality
tables, in the preparation of annual reports on vital statistics which
necessitate the calculation, for a variety of purposes and areas, of
estimated populations for each year since the last census; and such
calculations have often been extended to the prediction of the
populations which may be expected to exist many years from the present
time. The term intercensal population may be applied to a population
figure calculated for a date anywhere between two dates for which the
actual census returns are available; while a postcensal population is
one computed for a date subsequent to that of the last census.

56. Another type of estimated population, namely, the average or
mean population, is also required sometimes when a table is constructed
to represent the mortality which has prevailed during a calendar year,
or during a series of calendar years between two censuses (see par. 85).
This “mean population,” in respect of a certain community for a given
period, may be interpreted as the equivalent of the number of such
periods of life which were actually lived by the members of the
community during that period. “Thus the statement that a mean population
of a given town for a certain month was 1,000,000 should mean that
during the month 1,000,000 months of human life had been experienced in
the town” (C. H. Wickens, J.I.A., XLIII, 67); and in the usual case of
one year being considered as a unit the expression “years of life” is
consequently often used to denote the mean population multiplied by the
number of years to which that mean population relates. Such a mean
population may be expressed very conveniently in terms of the
intercensal populations by the integral calculus; for if the intercensal
population at any time $t$ since the last census is, say, $P_t$ when one
year is taken as a unit, the mean population for, say, a 10-year period
will be

$$\frac{1}{10}\int_0^{10} P_t\,dt,$$

or if the entire ten years are taken as the unit the equivalent
expression would be

$$\int_0^{1} P'_t\,dt,$$

in which $P'_t$ is the population at time $t$ with ten years as the unit.
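
That the two expressions agree can be seen by a change of variable. The short derivation below is a sketch only; the symbol $u$ (time measured in ten-year units, so that $t = 10u$ and the population at that instant is $P_{10u}$ on the one-year scale) is introduced here purely for illustration.

```latex
\frac{1}{10}\int_0^{10} P_t\,dt
  \;=\; \frac{1}{10}\int_0^{1} P_{10u}\,(10\,du)
  \;=\; \int_0^{1} P_{10u}\,du ,
\qquad\text{and } P_{10u} \text{ is the population at time } u
\text{ with ten years as the unit.}
```

Multiplying either form by 10 gives the “years of life” lived in the ten-year period.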

57. The most obvious method of computing intercensal, postcensal,
and mean populations for the whole of a country is the statistical
method, by which a continuous statistical record of the movement of
population is maintained, and the total population at any date is the
Number at Preceding Census + (Births + Immigrants) - (Deaths +
Emigrants). This method has been used for many years in Sweden (where
the estimates have been remarkably accurate) and in a number of other
European countries; Australia and New Zealand have employed it for a
long time; and it has been adopted in Great Britain, and in the United
States since the birth and death registration areas were extended to
cover the entire country. The greatest difficulty in its application is
the incompleteness of the statistics of migration, and especially of
emigration; for although many countries record particulars of immigrants
with reasonable accuracy by personal examination at the place of entry,
the departures from a country are not always supervised, so that many
emigrants escape notice altogether and can only be traced imperfectly if
they are reported as immigrants into another country. On this account it
is sometimes assumed, where birth and death registrations are reliable,
that any discrepancy between a census population predicted by this
method and the actual census figure is attributable to “unrecorded
departures,” and that consequently the recorded departures should be
increased in the appropriate ratio (cf. Wickens, J.I.A., XLIII, 61, and
J. S. Thompson, T.A.S.A., XIX, 260). In applying the method for
estimating the population of the United States, the birth and death
statistics are corrected by the Bureau of the Census for
under-registration (see the Bureau’s “Estimated Population in
Continental United States, by Age, Color, and Sex, 1940-1942,”
Population Special Reports, 1944). A recent discussion of several
aspects of the problems involved may be found in a paper by J. S. Siegel
and C. H. Hamilton on “Some Considerations in the Use of the Residual
Method of Estimating Net Migration” (J.A.S.A., XLVII, 475).
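
As a minimal sketch of the statistical method and of the “unrecorded departures” adjustment, the following Python fragment may help. The function names and all figures are hypothetical, and the adjustment assumes, as the text does, that births, deaths, and immigrants are reliably recorded.

```python
def statistical_estimate(census, births, deaths, immigrants, emigrants):
    """Number at preceding census + (births + immigrants) - (deaths + emigrants)."""
    return census + (births + immigrants) - (deaths + emigrants)


def departure_adjustment_ratio(census_start, census_end, births, deaths,
                               immigrants, recorded_emigrants):
    """Ratio by which recorded departures would have to be increased so that
    the statistical estimate reproduces the later census exactly."""
    implied_emigrants = census_start + births + immigrants - deaths - census_end
    return implied_emigrants / recorded_emigrants


# Hypothetical intercensal movement.
predicted = statistical_estimate(1_000_000, 250_000, 120_000, 80_000, 40_000)
ratio = departure_adjustment_ratio(1_000_000, 1_160_000, 250_000, 120_000,
                                   80_000, 40_000)
```

Here the predicted figure exceeds the assumed census count of 1,160,000 by 10,000, so the recorded departures of 40,000 would be raised in the ratio 1.25.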

In determining a mean population by this method the integral as in
par. 56 would be computed from the yearly (or half-yearly, quarterly, or
monthly) intercensal populations either by a formula of approximate
integration, or by the less accurate assumption of linear progression
used by J. S. Thompson in T.A.S.A., XIX, 265 and 268, and stated by
Wickens in J.I.A., XLIII, 67, under which, for example, the mean
population for a ten-year period could be obtained from the intercensal
populations as at the beginning of each year by taking one-tenth of the
sum of the yearly means. When there are marked seasonal fluctuations in
the migration, as in the case of New Zealand, it may be desirable to
weight the intercensal populations to allow for such fluctuations (cf.
E. P. Neale, “A New Zealand Study in Seasonal Fluctuations of External
Migration, with Special Reference to the Computation of Mean Annual
Populations,” J.R.S.S., LXXXVI, 226).
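
The linear-progression rule can be illustrated as follows. This is a sketch under the assumption that the population is known at the beginning of each year and at the end of the last year; the function name and the sample figures are hypothetical.

```python
def mean_population_linear(yearly):
    """Mean population over n years from n + 1 yearly populations
    (start of each year plus the end of the last), assuming linear
    progression within each year: 1/n times the sum of the yearly means."""
    n = len(yearly) - 1
    yearly_means = [(yearly[k] + yearly[k + 1]) / 2 for k in range(n)]
    return sum(yearly_means) / n


# Hypothetical two-year period.
mean_pop = mean_population_linear([100.0, 110.0, 120.0])
years_of_life = mean_pop * 2
```

The “years of life” for the period are then the mean population multiplied by the number of years, as described in par. 56.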

In applying the statistical method to various age groups, the process
may be applied to each group separately, and the total of the estimates
for all age groups would then be taken as the estimated total
population. An alternative procedure for effecting the distribution by
age groups, however, was used in the 1911 Australian Life Tables, where
the scale of distribution expressed by the ratio $\pi_t/P_t$, where
$\pi_t$ is the population of a particular age group, was assumed to be a
function of $t$, say, $f(t)$, and sufficiently accurate results were
obtained by assuming linear progression, so that $f(t) = a + bt$, in
which case the mean population over $n$ years for an age group, being

$$\frac{1}{n}\int_0^{n} f(t)\,P_t\,dt,$$

becomes

$$\frac{1}{n}\int_0^{n} (a + bt)\,P_t\,dt,$$

which is evaluated by calculating $a$ and $b$ for each age group from the
age distributions at the initial and terminal censuses and the integrals

$$\int_0^{n} P_t\,dt \quad\text{and}\quad \int_0^{n} t\,P_t\,dt$$

by approximate integration (see 1911 Census of Australia, Vol. I, pp.
85-87).
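
A minimal sketch of this age-group calculation follows. The trapezoidal rule is used here as one possible formula of approximate integration (the source does not say which formula the Australian tables employed), and the function names and figures are hypothetical.

```python
def trapezoid(values):
    """Trapezoidal rule with unit (one-year) spacing."""
    return sum((values[k] + values[k + 1]) / 2 for k in range(len(values) - 1))


def mean_group_population(pi_first, pi_last, totals):
    """Mean population of one age group over n years.

    pi_first, pi_last: group populations at the initial and terminal censuses.
    totals: total population P_t at t = 0, 1, ..., n (n + 1 yearly values).
    The group's share of the total is assumed to be linear, f(t) = a + b*t.
    """
    n = len(totals) - 1
    a = pi_first / totals[0]
    b = (pi_last / totals[-1] - a) / n
    int_p = trapezoid(totals)                                   # integral of P_t
    int_tp = trapezoid([t * p for t, p in enumerate(totals)])   # integral of t*P_t
    return (a * int_p + b * int_tp) / n


# Hypothetical check: a constant total with a group rising from 100 to 200.
example = mean_group_population(100.0, 200.0, [1000.0, 1000.0, 1000.0])
```

With a constant total the group itself progresses linearly, so the result reduces to the simple average of its two census values.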




58. While this statistical method will frequently give close estimates
of population for the whole of a country, it cannot be applied directly
to the populations of particular localities therein, since no migration
figures covering movements within each country are generally available.
Special procedures are therefore necessary for the subdivision of the
national figure into estimates of its component parts.

The first such methods for estimating local populations were based on
the theory that the population of a particular locality could be assumed
to bear some definite relation to the number of inhabited houses or
family dwellings in the district, or the number of births or the
children attending school in certain grades, or to economic data such as
the number of water, gas, or electric meters. This theory was applied
originally by taking account of only one such factor; and although it
then reflects only one of a number of variable influences, it can
produce good results in some cases because of its ability to take
particular local characteristics into account (see 12th U.S. Census,
Supplementary Analysis, p. 580, and discussions such as those of E. F.
Young in American Journal of Sociology, XXXVIII, No. 4, and F. J. Eberle
in J.A.S.A., XXXIII, 694).

E. C. Snow, also, has suggested the use of multiple correlation (see
“The Application of the Method of Multiple Correlation to the Estimation
of Post-Censal Populations,” J.R.S.S., LXXIV, 575), by means of which
the increase of population between two censuses is expressed as a linear
function of two or three different variables such as (a) the increase of
the births during the period over those of the preceding intercensal
period, and the similar increase in the deaths, and in the marriages, or
(b) the natural increase (i.e., births less deaths), and the increase in
the number of inhabited houses, or (c) the increase in the inhabited
houses and the increase in rateable values. While this method was shown
to give some very good results, it is somewhat lengthy, and simpler
methods can generally be used.

In the United States the Bureau of the Census, during many years of
experimentation, has developed several procedures with the particular
object of producing reliable estimates for State and local populations.
Following early trials with a ratio method founded on the numbers of
public utility consumers, city directories, voting registrations, and
school censuses, the Bureau in 1936 published State figures based on
natural increase with net migration estimated from comparisons of actual
and expected school enrolments. For 1942 and 1943 the wartime
registrations for ration books provided material for State and county
estimates which were very extensive and unquestionably accurate (see
also the footnotes to the table in par. 7, and par. 11, on the use of
similar ration-book data for the publication of national population
figures in the United Kingdom and Eire). For 1946 and 1947 estimates
were prepared following the Bureau’s development of two methods for
computing net migration from school data:*

(i) In the simpler method it is assumed that “the difference between
the percentage change in elementary school enrolment for the local area
and the national percentage change in population of elementary school
age is equal to the percentage change through net migration to or from
the local area”;

(ii) In a more elaborate process, which gave promise of improved
results, “net migration is measured on the basis of the difference
between the population of elementary school age as estimated from the
school data and the expected population of that age had there been no
migration since the base date [and] the expected population is computed
by applying survival rates from an appropriate life table to the
population cohort at the base date that became the population of
elementary school age at the estimate date.”
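
The arithmetic behind the two school-data methods can be sketched as follows. The function names, the single overall survival rate, and every figure are assumptions made for illustration; the Bureau's published procedures involve further detail that is not reproduced here.

```python
def migration_rate_simple(enrol_base, enrol_now, national_base, national_now):
    """Method (i): local percentage change in elementary enrolment minus the
    national percentage change in population of elementary school age."""
    local_change = (enrol_now - enrol_base) / enrol_base
    national_change = (national_now - national_base) / national_base
    return local_change - national_change


def net_migration_cohort(school_age_estimate, cohort_at_base, survival_rate):
    """Method (ii): school-age population estimated from the school data,
    less the survivors expected from the base-date cohort had there been
    no migration."""
    expected = cohort_at_base * survival_rate
    return school_age_estimate - expected


# Hypothetical local area: enrolment up 12%, national school-age population up 5%.
rate = migration_rate_simple(10_000, 11_200, 20_000_000, 21_000_000)
# Hypothetical cohort of 12,000 with a survival rate of 0.98 to the estimate date.
net = net_migration_cohort(12_500, 12_000, 0.98)
```

In this invented case method (i) attributes a 7% change to net migration, and method (ii) gives a net inward movement of about 740 children of school age.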

In every method of estimating local populations of States, counties,
cities, towns, or other areas, and particularly in attempting to make
postcensal estimates for dates in the near or distant future, it may
also be necessary, of course, to make special allowances for factors
likely to produce unusual disturbances, such as the establishment or
disappearance of large plants or industries and other economic changes,
or severe epidemics, or physical or sociological developments in or near
the area which might sharply alter the prosperity or attractiveness of
the locality. In wartime, moreover, distinction must be maintained
between the civilian population and those in the armed forces.

*See “Population, Special Reports, P-47, No. 4,” “Current Population
Reports, Population Estimates, P-25, No. 12,” and “Current Status of
State and Local Population Estimates in the Census Bureau” by H. S.
Shryock and N. Lawrence, in J.A.S.A., XLIV, 157, where the procedures
are summarized, and the necessary precautions respecting the underlying
assumptions and the precise school data to be used are discussed fully.

59. When it is desired to compute estimates of population from the
data of two censuses only by using some general hypothesis of population
growth, the assumption of arithmetical progression has been employed
widely, on account of its simplicity and because under many normal
conditions it has been shown to give reasonable results (see the U.S.
Census Bureau’s estimates which were based on this assumption for many
years; T.A.S.A., XVIII, 265; J. S. Thompson’s “Note on Mean Population,”
T.A.S.A., XIX, 256; and some of the calculations in H. H. Wolfenden’s
Canadian estimates, T.A.S.A., XXXV, 283-89). By this method, if in a
given age (or other) group $\pi_1$ be the population enumerated at one
census, $\pi_2$ that at the next census $n$ years later, and $\pi_t$ the
intercensal population at time $t$ after the first census, then

$$\pi_t = \pi_1 + \frac{t}{n}\,(\pi_2 - \pi_1)$$

and the mean population over the $n$ years, namely

$$\frac{1}{n}\int_0^{n} \pi_t\,dt,$$

becomes $\tfrac{1}{2}(\pi_1 + \pi_2)$. Also, calling the total population
at the first and second censuses $P_1$ and $P_2$ respectively, so that
$P_1 = \Sigma\pi_1$ and $P_2 = \Sigma\pi_2$, it follows that the
intercensal total population $P_t$ may be calculated either from the
group values as

$$\Sigma\pi_t = \Sigma\left\{\pi_1 + \frac{t}{n}\,(\pi_2 - \pi_1)\right\}$$

or directly from the totals as

$$P_1 + \frac{t}{n}\,(P_2 - P_1);$$

and the mean total population similarly will be given either as the sum
of the group means or independently as

$$\frac{1}{n}\int_0^{n} P_t\,dt,$$

that is, $\tfrac{1}{2}(P_1 + P_2)$.
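
A short sketch of the arithmetical progression method is given below, with hypothetical function names and figures. It also illustrates the property noted above, that the total may be obtained either by summing the group estimates or directly from the census totals.

```python
def ap_population(pi1, pi2, n, t):
    """Intercensal population t years after the first census (A.P.)."""
    return pi1 + (t / n) * (pi2 - pi1)


def ap_mean(pi1, pi2):
    """Mean population over the intercensal period (A.P.)."""
    return (pi1 + pi2) / 2


# Two hypothetical groups, censuses ten years apart, estimate at t = 4.
groups = [(100.0, 150.0), (200.0, 180.0)]
sum_of_groups = sum(ap_population(a, b, 10, 4) for a, b in groups)
direct_total = ap_population(300.0, 330.0, 10, 4)
```

Both routes give the same total, which is why subdivision presents no difficulty under A.P.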

60. The arithmetical progression method assumes a yearly increase of
constant amount. Since, however, population begets population it has
often been suggested that a preferable hypothesis will be that of
geometrical progression, under which a constant rate of increase
(instead of a constant amount) is assumed. This principle is frequently
applicable, as long as it is not used for the prediction of populations
many years hence, or in cases where the density of population has become
so great that a decreasing rate of increase is indicated.

The rate of increase $r$ is obtained from the supposition that
$\pi_2 = r\pi_1$. Taking one year as the unit, the intercensal population
$\pi_t$ is then $r^{t/n}\pi_1$ on the assumption that the progression is
continuous, and the mean population for the $n$ years is

$$\frac{1}{n}\int_0^{n} \pi_1 r^{t/n}\,dt,$$

that is,

$$\pi_1\,\frac{r - 1}{\lambda r}$$

or

$$k\,\pi_1\,\frac{r - 1}{\log r},$$

where $\lambda$ denotes Napierian and “log” the common logarithm, and $k$
is the modulus. Or taking the $n$ years as the unit the intercensal
population is $r^{t}\pi_1$ and the mean population

$$\int_0^{1} \pi_1 r^{t}\,dt,$$

which gives the same result as before.* If the mean population were
required for a period not exactly coincident with the $n$ years of the
intercensal period the limits of integration would be suitably modified
(see, for example, J.I.A., XLII, 261).
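
The geometrical progression formulas may be sketched as follows, taking r as the ratio of the two census figures for the group. The function names and sample figures are hypothetical, and the natural logarithm is used for the Napierian logarithm of the text.

```python
import math


def gp_population(pi1, pi2, n, t):
    """Intercensal population t years after the first census, the group
    being assumed to grow continuously in G.P. from pi1 to pi2 over n years."""
    r = pi2 / pi1
    return pi1 * r ** (t / n)


def gp_mean(pi1, pi2):
    """Mean population over the intercensal period under G.P.:
    pi1 * (r - 1) / ln(r), with the stationary case handled separately."""
    r = pi2 / pi1
    if r == 1.0:
        return pi1
    return pi1 * (r - 1) / math.log(r)


# Hypothetical group growing from 1,000 to 1,210 in two years (10% a year).
mid_point = gp_population(1000.0, 1210.0, 2, 1)
mean_pop = gp_mean(1000.0, 1210.0)
```

For these figures the mean under G.P. lies a little below the arithmetic mean of the two census values, 1,105.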

*Cf. G. King, J.I.A., XLII, 260; C. H. Wickens, J.I.A., XLIII, 68; and
J. S. Thompson, T.A.S.A., XIX, 258. On p. 257 of the last-mentioned
paper an approximate method is also shown, by which the mean population
over n years is taken as 1/n times the sum of the successive yearly
geometric means.

In calculating intercensal and mean populations, $r$ is generally taken
as the ratio of the numbers in the same group at the two censuses $n$
years apart; so that if $\pi_x^{s}$ denote the numbers at age $x$ at the
census in year $s$, $r$ would be

$$\frac{\pi_x^{s}}{\pi_x^{s-n}}.$$

In predicting future populations this same principle is usually extended
by taking $\pi_x^{s+n}$ as $r\pi_x^{s}$, that is,

$$\pi_x^{s+n} = \pi_x^{s}\,\frac{\pi_x^{s}}{\pi_x^{s-n}}.$$

Persons aged $x$ at one census, however, are the survivors of the
migration during the intercensal period and of those who were aged
$x - n$ at the census $n$ years previously. Hardy and Wyatt, therefore,
in preparing the 1911 age-group estimates for the National Insurance Act
and the National Health Tables in the United Kingdom, allowed for this
variation in age by assuming that the rates of mortality and the net
rates of migration at various ages over the years $s$ to $s + n$ were
similar to those of the period from $s - n$ to $s$, so that the ratio

$$\frac{\pi_x^{s+n}}{\pi_{x-n}^{s}}$$

would be equivalent to

$$\frac{\pi_x^{s}}{\pi_{x-n}^{s-n}},$$

whence $\pi_x^{s+n} = r'\,\pi_{x-n}^{s}$ where

$$r' = \frac{\pi_x^{s}}{\pi_{x-n}^{s-n}}$$

(see J.I.A., XLV, 411, and XLVII, 553).*
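
The difference between the usual same-age projection and the cohort form attributed above to Hardy and Wyatt can be sketched as follows. The dictionaries keyed by age, the function names, and the figures are all hypothetical, and the sketch ignores the further modifications which the authors introduced for changing mortality.

```python
def project_same_age(current, previous):
    """Usual extension: the number at age x is multiplied by the ratio of
    the numbers at the same age x at the last two censuses."""
    return {x: current[x] * current[x] / previous[x]
            for x in current if x in previous}


def project_cohort(current, previous, n):
    """Cohort form: those aged x - n at the last census are carried forward
    n years by the ratio of (aged x now) to (aged x - n at the census before)."""
    projected = {}
    for x in current:
        if (x - n) in current and (x - n) in previous:
            projected[x] = current[x - n] * current[x] / previous[x - n]
    return projected


# Hypothetical numbers at ages 20 and 30 at two censuses ten years apart.
previous = {20: 1000.0, 30: 900.0}
current = {20: 1200.0, 30: 950.0}
same_age = project_same_age(current, previous)
cohort = project_cohort(current, previous, 10)
```

Only age 30 can be projected on the cohort basis in this small example, because the cohort method needs the numbers aged ten years younger at both earlier censuses.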

If this method, which may be called the true G.P. method, be applied
to each age-group separately, by taking $r$ in each case as the rate of
increase shown by the particular group under consideration, then the
total intercensal population $P_t$ or the mean population must be taken
merely as the sum of the various age-group figures. An independent
calculation from $P_1$ and $P_2$, on the assumption that $P_2 = RP_1$,
cannot be made, because in practice it is found that the rates of
increase of the various groups and of the total population are not the
same, with the result that it cannot be assumed that the total
population also follows its own G.P., for the sum of a number of G.P.’s
is not itself a G.P.
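
A small numerical illustration of this last point, with invented figures and a hypothetical helper function: one group is stationary and another doubles, and the sum of their separate G.P. values at the middle of the period is compared with the value given by a G.P. fitted to the totals.

```python
def gp_value(first, last, t):
    """Value at fraction t (0 to 1) of the intercensal period on a G.P."""
    return first * (last / first) ** t


groups = [(100.0, 100.0), (100.0, 200.0)]   # hypothetical census pairs
t = 0.5

sum_of_groups = sum(gp_value(a, b, t) for a, b in groups)
total_on_own_gp = gp_value(sum(a for a, _ in groups),
                           sum(b for _, b in groups), t)
```

The two figures differ by more than three persons in a total of about 240, so the total cannot follow its own G.P. if each group follows its own.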

*Necessary modifications were introduced to allow for the decrease in
the rates of mortality of 1901-11 (s to s + n) in comparison with those
of 1891-1901 (s - n to s).

61. This fact is not objectionable in some cases, as in the
calculation of intercensal or mean populations for certain fixed
age-groups. There are many instances in vital statistics, however, where
intercensal estimates have to be made for thousands of subdivisions in
such a way that when estimates for certain subdivisions of the total
population, such as age-groups, have been made, these estimates must
admit of further subdivision, and the total population also must permit
of subdivision into other groups, such as by localities, without
disturbing either the subdivisional estimates or the mean total
population already calculated. The “true G.P. method” does not fulfil
this condition. Several methods have therefore been suggested under
which only the total population is assumed to follow its own G.P., the
various groups being assumed to progress in some manner (necessarily not
so that each one follows its own G.P.) such that the total of the
estimates of the various groups will actually produce the total
population estimated, as assumed, upon its own G.P.

(i) Of these methods, the simplest may be called a modified
geometrical progression. The procedure is to compute the A.P. value for
each group and then to multiply each value by the constant ratio

$$\frac{\text{Total Population by G.P.}}{\text{Total Population by A.P.}}$$

The total population thus takes its G.P. value; and the group values are
brought to an approximate G.P. basis, while retaining the proportionate
distribution of A.P., in such a way that subsequent subdivisions may be
effected easily without disturbing the original calculations. The method
was used for a number of years by the Registrar-General of England and
Wales for estimating local populations (see E. C. Snow, op. cit. in par.
58 here, and 72nd Registrar-General’s Report, p. lx); it was suggested
independently and examined by H. H. Wolfenden in T.A.S.A., XX, 225; and
it was employed for the official Canadian population estimates in the
Annual Report on Vital Statistics, 1922 (see also 1921 Census of Canada,
I, xliv). It may be noted that in the case of a stationary population
group the formula gives a result below the stationary value, which,
although a slight theoretical defect, is an error on the safe side (see
T.A.S.A., XX, 225). In practice, the method frequently gives good
results (see, for examples, T.A.S.A., XX, 227, and XXXV, 285-88); and,
as there stated, “particularly where it is found desirable to base the
mean total population on G.P., while the various constituent groups show
such divergent rates, as is usual, that neither A.P. nor G.P. can
confidently be assumed for them, this method would meet the case, and it
can be applied to any number of sub-groups with great facility.”
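
The modified geometrical progression can be sketched as below for an intercensal estimate at time t. The function name and the figures are hypothetical; the example also shows the slight defect mentioned in the text, namely that a stationary group comes out below its stationary value.

```python
def modified_gp(groups, n, t):
    """Scale each group's A.P. value by (total by G.P.) / (total by A.P.).

    groups: list of (pi_1, pi_2) census pairs; n: years between censuses;
    t: years since the first census.
    """
    p1 = sum(a for a, _ in groups)
    p2 = sum(b for _, b in groups)
    ap_values = [a + (t / n) * (b - a) for a, b in groups]
    total_ap = p1 + (t / n) * (p2 - p1)
    total_gp = p1 * (p2 / p1) ** (t / n)
    ratio = total_gp / total_ap
    return [value * ratio for value in ap_values]


# Hypothetical stationary group and doubling group, mid-way between censuses.
estimates = modified_gp([(100.0, 100.0), (100.0, 200.0)], 10, 5)
```

The estimates add up to the total on its own G.P., and because every group is multiplied by the same constant the A.P. proportions are preserved on further subdivision.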

(ii) The Registrar-General, however, later adopted a method which
may be called A. C. Waters’ First Method, from the name of its author
who published it originally in J.R.S.S., LXIV, 293 (see also Part I,
Supplement to 65th Report of the Registrar-General, p. cxvii; J.I.A.,
XLII, 265; T.A.S.A., XIX, 259; and R. Henderson’s “Mortality Laws and
Statistics,” pp. 53-55). The condition that the sum of the group values
shall equal the total G.P. population is fulfilled by assuming that the
ratio of each group to the total population, viz., $\pi_t/P_t$,
increases in A.P. so that $\pi_t = (a + bt)P_t$, say (as in the method
of par. 57). Then, since the total population is founded upon G.P. so
that $P_t = R^{t}P_1$, it follows that $\pi_t = (a + bt)R^{t}P_1$, and
the mean population is therefore ($\lambda$ again denoting the Napierian
logarithm)

$$\int_0^{1} (a + bt)R^{t}P_1\,dt
  = P_1\left\{a\,\frac{R - 1}{\lambda R}
  + b\left[\frac{R}{\lambda R} - \frac{R - 1}{(\lambda R)^2}\right]\right\}.$$

The values of $a$ and $b$ are determined from the fundamental
assumption, which gives $\pi_1 = aP_1$ and $\pi_2 = (a + b)P_2$; and
hence the mean population of the group becomes

$$\pi_1\left[\frac{R - 1}{(\lambda R)^2} - \frac{1}{\lambda R}\right]
  + \pi_2\left[\frac{1}{\lambda R} - \frac{R - 1}{R\,(\lambda R)^2}\right]$$

and the sum of these values for the various groups is

$$P_1\,\frac{R - 1}{\lambda R},$$

which is the geometric mean as required. The formula, being of the form
$f(R)\pi_1 + \phi(R)\pi_2$ where $f(R)$ and $\phi(R)$ depend only upon
the total population, ensures consistency between estimates once made
and subsequent subdivisions of those estimates; and the factors $f(R)$
and $\phi(R)$ when once calculated suffice for all the years and
groupings in the intercensal period.

This method has been used in the construction of English Life Tables
Nos. 6 and 7, the London Life Table, and by M. D. Grant in constructing
his Canadian Life Tables (see T.A.S.A., XXXV, 291). It has, however, a
slight theoretical defect. The fundamental assumption upon which it is
based, namely, that the ratio

$$\frac{\text{Age Group}}{\text{Total Population}}$$

increases in A.P., while the total population increases in G.P., cannot
contemplate a stationary population in any group, for with a stationary
group $b$ and the total population following a G.P. $y = aR^{x}$, the
ratio

$$\frac{\text{Age Group}}{\text{Total Population}}$$

follows the curve

$$y = \frac{b}{aR^{x}},$$

which is a G.P. and not an A.P. as assumed. The result is that it gives
a value in excess of the census populations in the case where
$\pi_1 = \pi_2$ (see A. C. Waters, 70th Annual Report of the
Registrar-General, p. cxxxii, and H. H. Wolfenden, T.A.S.A., XX, 221).
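
Waters' first method can be sketched as follows. The two factors are obtained here by integrating (a + bt)R^t over the intercensal period taken as the unit of time, which is an independent derivation from the stated assumptions rather than a transcription of the printed formula; the function names and figures are hypothetical, and a non-stationary total (R different from 1) is assumed.

```python
import math


def waters_first_factors(big_r):
    """Factors f(R) and phi(R) such that the group mean is f*pi_1 + phi*pi_2."""
    lam = math.log(big_r)                       # Napierian logarithm of R
    f = (big_r - 1) / lam ** 2 - 1 / lam
    phi = 1 / lam - (big_r - 1) / (big_r * lam ** 2)
    return f, phi


def waters_first_means(groups):
    """Mean population of each group; groups is a list of (pi_1, pi_2)."""
    p1 = sum(a for a, _ in groups)
    p2 = sum(b for _, b in groups)
    f, phi = waters_first_factors(p2 / p1)
    return [f * a + phi * b for a, b in groups]


# Hypothetical stationary group and doubling group.
means = waters_first_means([(100.0, 100.0), (100.0, 200.0)])
total_gp_mean = 200.0 * (1.5 - 1) / math.log(1.5)
```

The group means add up to the mean of the total on its own G.P., while the stationary group receives a mean slightly above 100, which is the theoretical defect described in the text.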

(iii) Waters consequently devised a second formula, which was
intended to be free from the above defect and may be referred to as
A. C. Waters’ Second Method (see 70th Registrar-General’s Report, p.
cxxxii; A. T. Traversi, J.R.S.S., LXXX, 84 and 529; H. H. Wolfenden,
T.A.S.A., XX, 222; and J. W. Glover, U.S. Life Tables, p. 555). If we
suppose that $\pi_t = m\pi_1 + n\pi_2$, one condition must be introduced
to give $m$ and $n$ determinate values; and if this condition be that
when the group population is stationary, so that $\pi_1 = \pi_2$, then
$\pi_t$ must also remain unchanged so that $\pi_t = \pi_1$, we must have
$m + n = 1$, so that $\pi_t = m\pi_1 + (1 - m)\pi_2$. And since the sum
of the group values must equal the total population similarly estimated
we must have $\Sigma\pi_t$, that is, $m\Sigma\pi_1 + (1 - m)\Sigma\pi_2$,
equal to $P_t$ or $mP_1 + (1 - m)P_2$, whence

$$m = \frac{P_2 - P_t}{P_2 - P_1}.$$

Consequently

$$\pi_t = \pi_1\,\frac{P_2 - P_t}{P_2 - P_1}
        + \pi_2\,\frac{P_t - P_1}{P_2 - P_1},$$

from which, assuming as before that the total population follows its
G.P. with ratio $R$, the mean population becomes ($\lambda$ again
denoting the Napierian logarithm)

$$\pi_1\left[\frac{R}{R - 1} - \frac{1}{\lambda R}\right]
  + \pi_2\left[\frac{1}{\lambda R} - \frac{1}{R - 1}\right],$$

which for practical calculations may be written

$$\pi_1 + (\pi_2 - \pi_1)\left[\frac{1}{\lambda R} - \frac{1}{R - 1}\right].$$

In this last form Traversi has given an alternative demonstration, which
clearly shows the fundamental assumption that $\pi_1$, in increasing to
$\pi_2$, changes in such a manner that the increase $(\pi_2 - \pi_1)$ is
supposed to follow the progression of the increment $(P_2 - P_1)$. Being
again of the form $f(R)\pi_1 + \phi(R)\pi_2$, it permits the subdivision
of results, and is very easily applied. The formula was used in the
Registrar-General’s Reports of 1911-14, in the United States Life
Tables, 1901-10, etc. (on p. 354 of which is a clear example of the
method of taking the limits of integration in dealing with an
intercensal period which is not an integral number of years), and on
account of the simplicity of its application in determining certain mean
populations from the New Zealand data for 1911-15 and 1916-20 (see L. S.
Polden, “The Construction of Mortality Tables from National Statistics
with Special Reference to Some Investigations Conducted in Respect of
the Population of New Zealand, and a Comparison of Mortality Rates
between Australia and New Zealand,” Actuarial Society of Australasia,
1926, and T.A.S.A., XXXV, 291). However, it does not completely rectify
the defect in Waters’ first formula, for it will give a mean population
exceeding the A.M. (which is contrary to the principle that the A.M. is
greater than the G.M.) in the case of a decreasing group when the total
population is increasing, that is, when $R > 1$ (see J.R.S.S., LXXX, 89,
and T.A.S.A., XX, 223; and H. H. Wolfenden, T.A.S.A., XXXV, 283-91, for
its application to the calculations of postcensal estimates).
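
A sketch of Waters' second method in its practical form follows. The multiplier is derived here from the assumptions stated in the text (each group moving in step with the total, and the total following its own G.P.); the function name and figures are hypothetical, and a non-stationary total is assumed.

```python
import math


def waters_second_means(groups):
    """Mean population of each group: pi_1 + (pi_2 - pi_1) * k, where
    k = 1/ln(R) - 1/(R - 1) depends only on the total population."""
    p1 = sum(a for a, _ in groups)
    p2 = sum(b for _, b in groups)
    big_r = p2 / p1                      # assumed different from 1
    k = 1 / math.log(big_r) - 1 / (big_r - 1)
    return [a + (b - a) * k for a, b in groups]


# Hypothetical stationary, decreasing, and increasing groups.
groups = [(100.0, 100.0), (100.0, 80.0), (100.0, 200.0)]
means = waters_second_means(groups)
total_gp_mean = 300.0 * (380.0 / 300.0 - 1) / math.log(380.0 / 300.0)
```

In this example the stationary group keeps its value and the group means add up to the G.P. mean of the total, but the decreasing group receives a mean above its arithmetic mean of 90, which is the remaining defect noted above.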

(iv) A. T. Traversi’s Method, which was given in J.R.S.S., LXXX, 84
and 529, removes entirely the theoretical blemishes in Waters’ two
formulae by imposing, in effect, the condition that the mean population for
each age group shall always fall between the A.M. and the G.M. The
excess of the A.M. over the G.M. for a group being

$$\pi_1\left(1 + \frac{i}{2}\right) - \pi_1\,\frac{i}{\log_e r} = \pi_1\left(\frac{i^2}{12} - \frac{i^3}{24} + \frac{19\,i^4}{720} - \cdots\right)$$

where $1 + i = r$, Traversi scales down the A.M. for each group in
proportion to $i^2\pi_1$, the approximate difference between the A.M. and the G.M.
His method of application is to calculate first the A.M. for the whole
population and for each group, and then to scale down the group values
as follows: “(a) Ascertain the difference between the results from A.P.
and G.P. for the population as a whole, i.e.,

(b) Ascertain the value of $i^2\pi_1$ or

$$\frac{(\pi_2 - \pi_1)^2}{\pi_1}$$

for each age group; (c) Divide up (a) in proportion to (b); (d) Deduct the
result from the A.M. of each age group.” Some numerical results by this
method are shown in T.A.S.A., XX, 227, for mean populations, and in
T.A.S.A., XXXV, 283-91, for postcensal estimates. A practical
disadvantage for some purposes is that subsequent subdivisions or
rearrangements of results once obtained cannot be made without disturbing the
original calculations.
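Steps (b) to (d) of the quoted procedure can be sketched in a few lines. This is only an illustration under stated assumptions: the weight in step (b) is taken to be $(\pi_2 - \pi_1)^2/\pi_1$, the whole-population difference of step (a) is supplied by the caller rather than computed, and the function name `traversi_means` and the figures in the usage note are hypothetical.

```python
def traversi_means(groups, total_excess):
    """Scale down each group's A.M. in the manner of steps (b)-(d).

    groups       -- list of (pi1, pi2) census populations for each age group
    total_excess -- step (a): A.P. result minus G.P. result for the whole
                    population (assumed to be computed elsewhere)
    """
    # Arithmetic mean of the two census figures for each group.
    arithmetic = [(p1 + p2) / 2.0 for p1, p2 in groups]
    # Step (b): weight assumed to be pi1 * i**2 = (pi2 - pi1)**2 / pi1.
    weights = [(p2 - p1) ** 2 / p1 for p1, p2 in groups]
    total_weight = sum(weights)
    if total_weight == 0:
        return arithmetic  # no group changed, so nothing to distribute
    # Steps (c) and (d): share out the total excess and deduct it.
    return [am - total_excess * w / total_weight
            for am, w in zip(arithmetic, weights)]
```

With the made-up groups `[(100.0, 121.0), (200.0, 200.0)]` and a total excess of 2.0, the whole deduction falls on the first group, because the second group did not change between the censuses.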

62. The assumption of G.P. is more defensible theoretically in the case
of a population which increases mainly by an excess of births over deaths,
while the change in a population increasing by immigration may
sometimes be represented more nearly by an A.P. (cf. D. E. Kilgour, T.A.S.A.,
XX, 229). It has therefore been suggested that a Combined Progression
Method might be used, by assuming that the increase each year is a
constant proportion of the previous year’s population plus a constant
number. Thus with two census enumerations $P_1$ and $P_2$, $n$ years apart, the
population one year after the first census would be, say, $P_1R + I$; that
after two years would be

$$(P_1R + I)R + I;$$

and so on. The relation between the two census populations is therefore

$$P_2 = P_1R^n + I\,\frac{R^n - 1}{R - 1}.$$

A difficulty with this method, however, is that unless a third census $P_3$ is
also used similarly, in order to give two equations for the solution of $R$
and $I$, the determination of $R$ and $I$ can only be effected approximately,
as proposed, for example, by C. H. Wickens, J.I.A., XLIII, 62 (see also
A. H. Mowbray, T.A.S.A., XX, 217).
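When a third census is available the two constants can be found directly. The sketch below assumes three enumerations at equal intervals of n years, so that subtracting the two census relations gives $R^n = (P_3 - P_2)/(P_2 - P_1)$; the function names and the test figures are hypothetical and are not taken from the papers cited.

```python
def fit_combined_progression(p1, p2, p3, n):
    """Solve P(next year) = P * R + I from three censuses spaced n years apart."""
    if p2 == p1:
        raise ValueError("the first two censuses must differ")
    growth = (p3 - p2) / (p2 - p1)          # equals R**n
    if growth <= 0:
        raise ValueError("increments change sign; the model does not apply")
    r = growth ** (1.0 / n)
    if abs(growth - 1.0) < 1e-12:
        i = (p2 - p1) / n                   # R = 1: a pure arithmetic progression
    else:
        # From P2 = P1 * R**n + I * (R**n - 1) / (R - 1).
        i = (p2 - p1 * growth) * (r - 1.0) / (growth - 1.0)
    return r, i


def project(p_start, r, i, years):
    """Apply the yearly recurrence P -> P * R + I for a whole number of years."""
    p = p_start
    for _ in range(years):
        p = p * r + i
    return p
```

Because the recurrence is applied year by year, `project` can also be used for intercensal years once R and I have been determined.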





The necessity of devising special methods for national populations
affected by unusual waves of immigration is illustrated by the Canadian
statistics from 1871 to 1921, for which close postcensal estimates for 1911
and 1921 resulted from the assumption that the decennial increase in
population since 1901 could be measured as an A.P. plus the excess of the
actual immigration over the “normal” average decennial immigration
which had occurred from 1871 to 1901 (see H. H. Wolfenden, T.A.S.A.,
XXXV, 290).


(J) The Prediction of Future Populations 

63. Although the theory underlying the assumption of G.P. (that
population begets population) is often appropriate for the calculation of
intercensal or mean populations from the data of the two nearest
censuses, it should be used only with great caution in any attempt to predict
postcensal populations on account of its tendency to give figures which
may be much too high (cf. the first part of par. 60 here, and see H. H.
Wolfenden, T.A.S.A., XXXV, 289, for a numerical illustration). In
estimating postcensal populations, moreover, it is clearly advisable to
select some hypothesis of population growth which aims to take cognizance
of the particular features of the population under examination, without
arbitrary assumptions concerning any supposed “law” of population
increment (cf. Wolfenden, op. cit., p. 290). The prediction of reasonably
defensible postcensal estimates is essentially a problem of curve-fitting or
extrapolation, whether it is based on the two nearest censuses or on many
more; and it must always be remembered (in the words of Henry Schultz,
“The Standard Error of a Forecast from a Curve,” J.A.S.A., XXV, 184,
and as emphasized in “The Fundamental Principles of Mathematical
Statistics,” p. 320) that “there is no necessary relation between the
goodness of fit of a curve to past observations and its reliability for forecasting
purposes; a curve may fit the data for the past one hundred years with a
high degree of accuracy, and yet fail to predict the situation for the next
year or so.”

(i) Since the growth of populations under many circumstances is
orderly and steady, without violent fluctuations, the methods of finite
differences can sometimes be applied, or a parabolic curve can be fitted to
the data by the method of moments or least squares. A third degree curve
was thus found by H. S. Pritchett (“On a Formula for Predicting the
Population of the United States,” J.A.S.A., II, 278) to give a sound
representation of the U.S. census populations from 1790 to 1890 (although the
long-range predictions based on that curve are much too large and afford
a good illustration of the dangers of extrapolation), and A. L. Bowley in
J.R.S.S., LXXXVIII, 77-79, gave least-square fittings of second degree
parabolas to the populations of England and Wales and of France from
1801 to 1911, and of a third degree curve for the United States from 1790
to 1910. It was also shown by Wolfenden in T.A.S.A., XXXV, 289-90,
that the populations of the Canadian Provinces could be estimated closely
by such methods. Raymond Pearl and L. J. Reed (“On the Rate of
Growth of the Population of the U.S. since 1790, and Its Mathematical
Representation,” Proceedings of the National Academy of Sciences, VI,
276) obtained improved results from the modified curve
$P_t = a + bt + ct^2 + d(\log t)$. Bowley (in J.R.S.S., LXXXVIII, 77-78), moreover, gave
an excellent representation of the population of England and Wales from
1801 to 1911 by means of the probability integral (see “The Fundamental
Principles of Mathematical Statistics,” p. 160, and compare E. B.
Wilson’s illustration of the growth in the number of epidemic scarlet fever
cases in the Proceedings of the National Academy of Sciences, XI, 451).
In some instances it may be more satisfactory to deal with the decennial
(or quinquennial) rates of increase instead of with the actual populations.
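A least-squares fitting of a second degree parabola of the kind mentioned above can be sketched as follows. This is a generic solution of the normal equations by elimination, not the arithmetic of any of the papers cited; the function name `fit_parabola` is hypothetical, and in practice the census dates would be coded as small numbers (for example decades from the first census) to keep the sums manageable.

```python
def fit_parabola(ts, ps):
    """Least-squares coefficients (a, b, c) of P = a + b*t + c*t**2."""
    if len(ts) != len(ps) or len(ts) < 3:
        raise ValueError("need at least three (t, P) pairs of equal length")
    # Power sums of t and the moments of P needed by the normal equations.
    s = [sum(t ** k for t in ts) for k in range(5)]
    m = [sum(p * t ** k for t, p in zip(ts, ps)) for k in range(3)]
    # Augmented matrix of the 3 x 3 normal equations.
    a = [[s[0], s[1], s[2], m[0]],
         [s[1], s[2], s[3], m[1]],
         [s[2], s[3], s[4], m[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [x - factor * y for x, y in zip(a[row], a[col])]
    return tuple(a[k][3] / a[k][k] for k in range(3))
```

A short-term estimate then follows by evaluating `a + b*t + c*t**2` one step beyond the last census, subject to the cautions about extrapolation given in the text.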

(ii) Emphasis has already been laid upon the important restriction
that curve-fitting methods of this kind are satisfactory only for short-term
predictions; when they are applied to long-term estimates they usually
fail to allow sufficiently for the slower rates of growth which frequently
operate as the populations become denser, so that they tend to
overestimate the ultimate populations. One useful method of giving effect to
this feature was Dr. T. H. C. Stevenson’s determination of the decennial
rates of growth of a particular community from those already experienced
under the same densities of population by a similar community in which
the development had already been practically completed (see the Journal
of Hygiene, 1904, p. 207). Also, since the rates of growth and the densities
of populations naturally approach limiting values, several analytical
expressions have been suggested to represent those conditions. Thus
Knibbs (in his “Mathematical Theory of Population,” Appendix A, 1911
Census of Australia, pp. 26, 42, and 55) has discussed forms such as
or 4- bt'^ + c/'" + . . . , or .I/'V*'" (sec also H. L. Moore’s use 
of the last, with “ 1, as a “law of demand” in J.A.S.A., XVIII, 12); 
and Gompertz’s formula Ag'"' and Makeham’s expression .1 + have
been tried with indifferent success (see R. B. Prescott, “Law of Growth in
Forecasting Demand,” J.A.S.A., XVIII, 471; L. E. Peabody, “Growth
Curves and Railway Traffic,” J.A.S.A., XIX, 476; and G. R. Davies,
“The Growth Curve,” J.A.S.A., XXII, 374).

(iii) The most widely discussed curve-fitting method for predicting
future populations is the Logistic curve, which was proposed first by
Verhulst in 1838 and was rediscovered independently by Pearl and Reed
in 1920. The assumption that the proportionate rate of increase over
time, $t$, of a population, $P_t$, growing in a restricted area, will tend to
decrease as the population becomes greater may be written in its simplest
form as

$$\frac{1}{P_t}\,\frac{dP_t}{dt} = m - nP_t,$$

where $m$ and $n$ are constants; the solution of this differential equation
gives

$$P_t = \frac{m/n}{1 + Ce^{-mt}}.$$

This “logistic” curve starts from a lower asymptote, follows closely a
G.P., reaches a point of inflexion, and thence proceeds symmetrically to
an upper asymptote.
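The properties just described can be checked numerically. The sketch below evaluates the solution $P_t = (m/n)/(1 + Ce^{-mt})$; the function names are hypothetical, and the remark that the point of inflexion falls where $Ce^{-mt} = 1$ (half the upper asymptote $m/n$) is a standard property of this curve rather than a statement from the text.

```python
import math


def logistic(t, m, n, c):
    """P_t = (m/n) / (1 + C*exp(-m*t)), solving (1/P) dP/dt = m - n*P."""
    return (m / n) / (1.0 + c * math.exp(-m * t))


def inflexion_time(m, c):
    """Time at which C*exp(-m*t) = 1, where P_t is half the upper asymptote."""
    if c <= 0:
        raise ValueError("C must be positive for an inflexion to exist")
    return math.log(c) / m
```

For example, with m = 0.03, an upper asymptote m/n = 200 and C = 9, the curve starts at 20 when t = 0 and passes through 100 at the point of inflexion; values taken at equal distances on either side of that point add up to the upper asymptote, which is the symmetry referred to above.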

Convenient alternative forms are 

‘ ‘ i'+ ‘ ’ 


where r denotes the abscissa of the point of inflexion, A and B are the
ordinates of the asymptotes, and ^ is a constant, or


$$P_t = \frac{L}{1 + e^{(\beta - t)/\alpha}},$$

where $L$ is the limiting population $P_\infty$, $\alpha$ determines the horizontal scale,
and $\beta$ is the time from zero to the point of inflexion, or

$$P_t = \frac{1}{1 + e^{-t}},$$

where the scales are chosen so that $L$ and $\alpha$ of the preceding expression
are unity and the point of inflexion is at zero time.

Pearl and Reed also showed (in their papers on “The Mathematical
Theory of Population Growth,” Metron, III, 12, and “The Summation of
Logistic Curves,” J.R.S.S., XC, 729, and in Pearl’s “Studies in Human
Biology”) that the essential symmetry of the preceding expressions for a
single logistic can be modified into a sinuous curve by summing
component logistics, in order to represent successive cycles of growth (as
illustrated in H. Mühsam’s “Note on Migration and Verhulst’s Logistic
Curve,” J.R.S.S., CII, 445). The generalized form can be expressed as

$$P_t = \frac{k}{1 + me^{a_1t + a_2t^2 + \cdots + a_nt^n}}.$$




In recent years the logistic curve has been fitted to a wide variety of
populations in many countries, by several different methods (see “The
Fundamental Principles of Mathematical Statistics,” pp. 321 and 327;
for graphical methods of fitting see E. B. Wilson, “The Logistic or
Autocatalytic Grid,” Proceedings of the National Academy of Sciences,
XI, 451, W. A. Spurr and D. R. Arnold, “A Short-Cut Method of Fitting
a Logistic Curve,” J.A.S.A., XLIII, 127, and E. A. Rasor, “The Fitting
of Logistic Curves by Means of a Nomograph,” J.A.S.A., XLIV, 548).
Undoubtedly the formula is capable of representing many series of known
past populations with reasonable accuracy, and it can by summation of
components or in its generalized form even be made to allow for
sinuosities; but it cannot have much claim to be accepted as a “law” of growth,
and its use for long-range predictions (in which the importance of the
standard errors of the forecast values is to be emphasized) is surrounded
inevitably by all the uncertainties and dangers of extrapolation (see
Wolfenden, op. cit., pp. 87-88, 160, 238, 320-21, and the discussions there
noted).

(iv) Without attempting to make any mathematical assumption
concerning the growth of the population, another procedure which can be
applied when the necessary factors are available (or can be computed or
selected with reasonable justification) is of course to project the various
age, sex, or other groupings of the population by means of survival ratios
which usually would be taken from a suitable life table.

In the notation of par. 64, a known population $P_x^z$, aged $x$ last birthday
according to a census taken at the beginning of a calendar year, would
thus be projected, by the survival factor $L_{x+n}/L_x$ from an appropriate life
table, to give an estimate of the population $n$ years hence which would
then be aged $x + n$ last birthday; or similarly the data can often be
handled more expeditiously and with little loss of accuracy by multiplying
the known populations in, say, 5-year age groups by factors

$$\frac{T_{x+n} - T_{x+5+n}}{T_x - T_{x+5}}$$

(where $T_x - T_{x+5}$ is the life-table population in the age group $x$ to $x + 4$
last birthday). In practice it may be necessary to adjust the survivors so computed to
allow for the number and age distribution of the estimated net migrants
during the period of the prediction, while the technique can be elaborated
on similar principles in order to subdivide the forecast populations
according to sex, race, nativity (i.e., native or foreign-born) or other groups.
In recent applications of these principles the calculations have usually
been made by sex and race, in quinquennial age groups over successive
5-year projection periods. It is also often important to make the
projections by means of survival and other factors which take into account such
future changes in mortality rates, etc., as may be anticipated within
reason.
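One 5-year step of such a projection by quinquennial age groups can be sketched as below. The sketch assumes that the life-table populations of the successive 5-year age groups are supplied as a list, that the last group is closed rather than open-ended, and that the youngest group is left at zero because it depends on births; the function name and the figures in the usage note are hypothetical.

```python
def project_five_years(pop, big_l, net_migrants=None):
    """Carry 5-year age-group populations forward by one 5-year period.

    pop[i]   -- known population in the age group 5i to 5i+4
    big_l[i] -- life-table population in the same age group
    net_migrants[i] -- optional net migrants to add to each projected group
    """
    if len(pop) != len(big_l):
        raise ValueError("pop and big_l must have the same number of age groups")
    # The youngest group comes from births in the period and is left at zero.
    projected = [0.0] * len(pop)
    for i in range(len(pop) - 1):
        # Survival ratio from one age group to the next.
        projected[i + 1] = pop[i] * big_l[i + 1] / big_l[i]
    if net_migrants is not None:
        projected = [p + m for p, m in zip(projected, net_migrants)]
    return projected
```

Applied to the made-up populations `[1000.0, 900.0, 800.0]` with life-table values `[490000.0, 485000.0, 480000.0]`, the first group yields 1000 x 485000/490000 survivors in the second group five years later. Separate lists would be used for each sex or race, and revised survival ratios could be substituted for later periods.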

The estimation of the populations $n$ years hence at ages under $n$, which
depend on the births during the $n$ years of the prediction period, can be
made (a) from assumed births during each of the $n$ years, on the principle
that each calendar year’s actual births would be multiplied by the
life-table ratio $L_{n-k-1}/l_0$ to reach a prediction of the population aged
$n - k - 1$ last birthday (see, for example, Greville’s numerical illustration
in the U.S. Life Tables and Actuarial Tables, 1939-41, p. 23), or (b) by
developing fertility rates of women according to age, applying them to the
appropriate female groups in order to estimate the births in the projection
period, subdividing those births according to the sex ratio at birth* based
on previous data, and then projecting these estimated male and female
births to obtain the survivors at the end of the period. Method (b) has
been adopted in the United States in all the recent calculations noted in
the following paragraph.
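The two approaches can be illustrated in outline. In this sketch the index k is taken to number the years of the projection period from 0 to n - 1, so that the births of year k are aged n - k - 1 last birthday at the end; the sex ratio of 105 males per 100 females is a placeholder assumption, and all function names and figures are hypothetical.

```python
def survive_annual_births(births_by_year, big_l, l0=100000.0):
    """Method (a): births of year k become the population aged n-k-1.

    births_by_year[k] -- assumed births in year k of the period (k = 0..n-1)
    big_l[age]        -- life-table L for the year of age `age`
    l0                -- life-table radix
    """
    n = len(births_by_year)
    by_age = [0.0] * n
    for k, births in enumerate(births_by_year):
        age = n - k - 1
        by_age[age] = births * big_l[age] / l0
    return by_age


def estimate_births(women, fertility, years=5, males_per_100_females=105.0):
    """First part of method (b): births from age-specific fertility rates.

    women[i]     -- average number of women in age group i during the period
    fertility[i] -- annual births per woman in age group i
    Returns (male births, female births) for the whole period.
    """
    total = years * sum(w * f for w, f in zip(women, fertility))
    male_share = males_per_100_females / (100.0 + males_per_100_females)
    return total * male_share, total * (1.0 - male_share)
```

The male and female births returned by `estimate_births` would then be carried forward by survival ratios in the same way as in method (a).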

Various projections for Great Britain (at intervals 5, 15, 30, 60, and 100
years from 1947) which were published in 1949 in the Report of the
Royal Commission on Population have been derived by a somewhat
different technique based on the view that the most important factors in
population growth (apart from heavy waves of migration) are the trend
in the marriage rate and the number of children born per married couple
(i.e., the size of the resulting families, as defined in the footnote to par.

here, according to duration of marriage rather than by age). Both
these factors are markedly dependent on economic conditions and
threatened or actual wars, so that they fluctuate in a much sharper and more
unpredictable manner than the relatively stable rates of mortality; they
are, moreover, the underlying factors which determine the future numbers
of births. Three series of projections were computed; all three assumed
that mortality would continue to fall (over the next 30 years according to
the trend of the decline during the last 50 years), and that marriage rates
could be taken at the intermediate level of 1942-47; they differed,
however, in the hypotheses respecting future family size: the first supposing
that family size would remain constant at the same level as among couples
married in 1927-38, the second that it would be constant at a level 6%

* The stability of the sex ratio at birth for each race, regardless of such factors as
order of birth, age of mother, economic conditions, etc., and regardless of war, is
indicated in R. J. Myers’ papers “A Note on the Variance of Sex Ratios” (Human Biology,
XV, 207), “Effect of the War on the Sex Ratio at Birth” (American Sociological Review,
XII, 40), and “War and Post-war Experience in Regard to the Sex Ratio at Birth in
Various Countries” (Human Biology, XXI, 257).




higher than the first, and the third that it would fall progressively to 80%
of the first. Final adjustments also were shown to illustrate the effects of
different assumptions with respect to net migration.

These general methods, which are essentially actuarial in nature, have
been used on many occasions. Early examples are those of A. L. Bowley
in 1924 with respect to “Births and Populations of Great Britain”
(Journal of the Royal Economic Society, XXXIV, 188) and his 1926 League of
Nations “Estimates of the Working Population of Certain Countries in
1931 and 1941.” Other population projections which have been made at
various times by several investigators in Great Britain are examined
briefly, with some descriptions of the techniques involved, by P. R. Cox
in chapters 12 and 13 of the Institute of Actuaries’ and Faculty of
Actuaries’ book on “Demography.” They have also been elaborated and
applied very extensively by Warren S. Thompson and P. K. Whelpton (of
the Scripps Foundation for Research in Population Problems) in respect
of the U.S. populations,* by Frank W. Notestein and others for various
European countries,† by R. J. Myers in the preparation of U.S.
population projections for social insurance cost estimates,‡ and by the Royal

* The first forecasts of Thompson and Whelpton were prepared for the National
Resources Planning Board as “Estimates of Future Population by States”;
in 1937 their “Population Statistics, National Data” gave six projections on differing
assumptions; and in 1943 the National Resources Planning Board sponsored their
revised and extended “Estimates of Future Populations of the United States, 1940-2000,”
in which twelve sets of estimates were given. In 1947 the Bureau of the Census published
“Forecasts of the Population of the United States, 1945-1975” which was prepared by
Whelpton alone. A new unit for the continuing official production of such estimates has
now been created within the Bureau.

† These estimates were made in 1944 by Notestein and four members of the Office
of Population Research of the League of Nations under the title “The Future
Population of Europe and the Soviet Union; Population Projections, 1940-1970,” with
technical appendices and detailed methodological notes (see also T.A.S.A., XLV, 4.S6).

‡ See “Illustrative U.S. Population Projections, 1949” (Actuarial Study No. 24, Social
Security Administration) by R. J. Myers, where estimates of the U.S. population by
age groups, for use in long range cost estimates of the Old Age and Survivors
Insurance program, are deduced, by the methods used by Thompson and Whelpton, up to
the year 2050 on four bases: (A) low fertility, high mortality, no immigration; (B) high
fertility, low mortality, and immigration of 100,000 per annum; (C) high fertility,
high mortality, no immigration; and (D) medium fertility, low mortality, and no
immigration. The most recent estimates, prepared by R. J. Myers and E. A. Rasor
for the actuarial cost estimates of the old-age and survivors insurance system, are now
the “Illustrative United States Population Projections, 1952” (Actuarial Study No.
Social Security Administration). An earlier discussion of the problems involved in
deciding upon the various factors required in this method is given in “Population, Birth,
and Mortality Trends in the United States” by R. J. Myers, T.A.S.A., XLI, 66. The





Commission on Population in Great Britain.* 

Because many different assumptions possessing varying degrees of
plausibility must be made with regard to marriage rates, fertility,
mortality, net migration, military losses, and other influences, such
projections are usually prepared on several alternative bases, which often
indicate ultimate conclusions with startling disparities. Their underlying
uncertainties must therefore be clearly understood. The grave doubts,
indeed, which must be expressed with regard to the validity of long-range
population estimates of this general character are well substantiated by
H. F. Dorn’s paper in J.A.S.A., XLV, 311, on “Pitfalls in Population
Forecasts and Projections” in the United States, and are further
emphasized by the scepticism concerning the projections of the Royal
Commission on Population in Great Britain which is recorded in J.I.A.,
LXXVI, 47-49.

projections made by the Committee on Economic Security in 1934 may also be noted
as summarized in “Issues in Social Security” (A Report to the Committee on Ways
and Means of the House of Representatives, January 17, 1946).

* See the Report of the Royal Commission on Population, 1949 (Cmd. 7695), and
the discussions in J.I.A., LXXVI, 38, and J.R.S.S., CXIV, 38.



VI 


THE MATHEMATICAL RELATIONSHIPS BETWEEN 
BIRTHS, DEATHS, AND POPULATIONS, AND THE 
FORMULAE FOR THE RATES OF MORTALITY 

64. When the population data and death records of an actual
community are used for the purpose of computing the fundamental
probability of death, $q_x$, in the year of age $x$ to $x + 1$, it is essential to keep clearly
in view the different calendar years of observation. In constructing the
formulae depicting the relationships between births, deaths, and
populations in such an actual community, moreover, the assumption of a uniform
distribution of deaths (with which the student will be familiar from his
earlier knowledge of the conventional “life table”)* will not be made until
the conditions justifying its introduction are understood. In order thus to
deal with these relationships involving both calendar years and years of
age, and at the same time to examine the admissibility of the assumption
of uniform distributions, it is necessary to employ a notation in which the
required distinctions will be carefully maintained. The following symbols
will therefore be employed in respect of the calendar year $z$.†

* It may be well here to emphasize again that $q_x$ is the basic function which is
required, and that the “life table,” as it is described in such textbooks as E. F. Spurgeon’s
“Life Contingencies” and as it is universally employed in actuarial work, “is merely a
convenient statistical device by which many complex formulae involving the
fundamental rates of mortality, either without or with an associated rate of interest, may
be translated from algebraical into arithmetical terms” (see H. H. Wolfenden, T.A.S.A.,
XXXV, 281). The “life table” represents a purely hypothetical community, subject to the
$q_x$ of the actual community, in which a constant number of births, $l_0$, occur each year
and there is no immigration or emigration, so that $q_x = d_x/l_x$ where, in life-table
notation, $l_x$ is the number attaining precise age $x$ and $d_x$ is the number dying in the
year of age $x$ to $x + 1$. In this hypothetical “life-table community,” as it may be
called, the “number living” $l_x$ is equivalent to the sum of the deaths at age $x$ and
beyond; and calendar years are not distinguished. The “central death rate” is $m_x = d_x/L_x$,
where $L_x$ is the population $\int_0^1 l_{x+t}\,dt$ in the year of age $x$ to $x + 1$; and $L_x = l_x - d_x/2$
in that portion of the table (usually at ages 5 and up) where the customary life-table
assumption of a uniform distribution of deaths over the year of age can be made.

† This notation is that given in H. H. Wolfenden’s paper “On the Determination of
the Rates of Mortality at Infantile Ages, from Statistics of the General Population,”
T.A.S.A., XXIV, 126, where it is employed as a simplification of that used by Professor
J. W. Glover in the U.S. Life Tables, 1890, etc., pp. 329 et seq. The statements and
demonstrations of all the formulae (1)-(18) and (21)-(39) here are also taken from
the same paper.



$P_x^z$ will denote the population aged $x$ last birthday (i.e., in the year of
age $x$ to $x + 1$) at the beginning (i.e., on January 1) of the calendar year
$z$; and $P_x^{z+1}$ will denote the population aged $x$ last birthday (i.e., in the
year of age $x$ to $x + 1$) at the end (i.e., on December 31) of the calendar
year $z$ (being the beginning, i.e., January 1, of the calendar year $z + 1$).

$D_x^z$ will denote the deaths aged $x$ last birthday (i.e., in the year of age
$x$ to $x + 1$) in the calendar year $z$.

$E_x^z$ will denote those who attain exact age $x$ during the calendar year $z$.
Thus $P$, $D$, and $E$ will be used for the data of the actual community; and
they are analogous, respectively, to the $L$, $d$, and $l$ of the life table. The
fundamental relation between them is

$$D_x^z = (P_x^z - E_{x+1}^z) + (E_x^z - P_x^{z+1}), \qquad (1)$$

the rationale of which may be seen easily by taking a particular case, as
follows: The deaths $D_{25}^{1905}$, say, in 1905 in the year of age 25-26 clearly arise
from the births of 1879 and 1880. From the 1879 births the deaths occur
between January 1, 1905, and attainment of age 26 in 1905, so that they
number $(P_{25}^{1905} - E_{26}^{1905})$; from the 1880 births the deaths in the year of age
25-26 in 1905 occur between attainment of age 25 in 1905 and December
31, 1905 (i.e., January 1, 1906), when they are variously aged from 25 to
26, so that they number $(E_{25}^{1905} - P_{25}^{1906})$; the addition of these two sections
gives the total deaths as in the general formula (where $x = 25$, and
the calendar year $z$ is the year commencing January 1, 1905).
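The arithmetic of the fundamental relation can be made concrete with a small sketch. The function name and the four figures in the usage note are hypothetical; only the structure of the calculation (deaths before the birthday among the earlier births, plus deaths after the birthday among the later births) follows the text.

```python
def calendar_year_deaths(p_start, e_next, e_this, p_end):
    """Deaths aged x last birthday in calendar year z, split by generation.

    p_start -- population aged x on January 1 of year z
    e_next  -- number attaining exact age x + 1 during year z
    e_this  -- number attaining exact age x during year z
    p_end   -- population aged x on December 31 of year z
    Returns (deaths before the birthday, deaths after the birthday, total).
    """
    before_birthday = p_start - e_next   # from the earlier year's births
    after_birthday = e_this - p_end      # from the later year's births
    return before_birthday, after_birthday, before_birthday + after_birthday
```

For instance, with 10,000 persons aged 25 on January 1, of whom 9,960 reach age 26 during the year, and 10,100 persons reaching age 25 during the year, of whom 10,055 are alive on December 31, the two sections are 40 and 45 deaths and the calendar year's deaths at age 25 number 85.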

If we also denote by $\delta D_x^z$ the deaths (in the first bracket) which occur
between the beginning of the calendar year and attainment of age $x + 1$,
and by $\alpha D_x^z$ the deaths (second bracket) between attainment of age $x$ and
the end of the calendar year, then

$$\delta D_x^z = P_x^z - E_{x+1}^z, \qquad (2)$$

$$\alpha D_x^z = E_x^z - P_x^{z+1}, \qquad (3)$$

and

$$D_x^z = \delta D_x^z + \alpha D_x^z. \qquad (4)$$

Also, from (2) and (3) it follows that

$$P_x^z = P_{x+1}^{z+1} + \delta D_x^z + \alpha D_{x+1}^z. \qquad (5)$$

By the successive application of these fundamental formulae it is then
possible to relate the populations to the births from which they arise. For,
applying (5) to the initial relation $E_0^z = P_0^{z+1} + \alpha D_0^z$ (3) we get
successively

$$E_0^z = P_0^{z+1} + \alpha D_0^z$$

$$E_0^z = P_1^{z+2} + \alpha D_0^z + \delta D_0^{z+1} + \alpha D_1^{z+1}$$

$$E_0^z = P_2^{z+3} + \alpha D_0^z + \delta D_0^{z+1} + \alpha D_1^{z+1} + \delta D_1^{z+2} + \alpha D_2^{z+2} \qquad (6)$$

and so on.

These equations show the exact manner in which the births $E_0^z$ may be
derived by adding the appropriate deaths to the populations of later
calendar years. Conversely, if the births are known, the populations may be
computed therefrom by deducting the appropriate deaths, as here set
out.*
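The deduction of successive deaths from the births of one calendar year can be sketched as follows. To avoid depending on any particular symbol, the two kinds of deaths are named by their timing; the function name and the figures in the usage note are hypothetical.

```python
def cohort_populations(births, after_birthday_deaths, before_birthday_deaths):
    """Populations of one year's births at successive ends of calendar years.

    after_birthday_deaths[k]  -- deaths at age k between attaining exact age k
                                 and the following December 31
    before_birthday_deaths[k] -- deaths at age k between that January 1 and
                                 attaining exact age k + 1
    Returns [population aged 0 at the end of the birth year,
             population aged 1 one year later, ...].
    """
    populations = []
    alive = births
    for k, deaths in enumerate(after_birthday_deaths):
        alive -= deaths
        populations.append(alive)            # aged k last birthday on Dec. 31
        if k < len(before_birthday_deaths):
            alive -= before_birthday_deaths[k]   # survivors reach exact age k+1
    return populations
```

With 1,000 births, deaths after the birthday of 60, 8, and 3 at ages 0, 1, and 2, and deaths before the next birthday of 25 and 5 at ages 0 and 1, the populations are 940, 907, and 899; adding all the deaths back to 899 recovers the 1,000 births, which is the converse use described above.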

65. From (5) and (2) it is clear that $\alpha D_x^z$ are the deaths in the year of
age $x$ to $x + 1$ in the calendar year $z$, out of those born in the calendar
year $z - x$; while $\delta D_x^z$, being the balance of the deaths in the year of age
$x$ to $x + 1$ in the calendar year $z$, were born in the preceding calendar
year. Consequently, if we can obtain numerical values for the latter
proportion $\delta D_x^z/(\delta D_x^z + \alpha D_x^z)$, which may be denoted by $f_x^z$, and therefore also
for its complement $(1 - f_x^z)$, the calendar year’s deaths may be divided
according to the two generations from which they arise. Thus

$$f_x^z = \frac{\delta D_x^z}{D_x^z}$$

and

$$1 - f_x^z = \frac{\alpha D_x^z}{D_x^z}.$$

In some countries, such as Germany and Austria, the deaths in each
calendar year have been registered according to calendar year of birth as
well as by year (or month) of age at death; and under such circumstances
tabulations of the $\delta D$ and $\alpha D$ deaths, and thence $f_x^z$ and $(1 - f_x^z)$, follow
directly (see, for example, Glover’s U.S. Life Tables, p. 5.S9, and J.I.A.,
LIII, 221).

In Great Britain, Canada, the United States, and most other countries,
however, such tabulations of deaths by calendar year of birth are not
generally available. An alternative procedure then is to calculate $f_x^z$ from
tabulations of the deaths by months of age. This, however, can usually be
done only for the first year of age, and sometimes for the second, because
the monthly deaths are not usually recorded for the later years of age. For
those first years $f_x^z$ can be found very closely by following the development
of the calendar years’ deaths as given in par. 64. For the deaths in, say,
1905 aged $0$ to $\tfrac{1}{12}$ must arise from the births of December 1, 1904, to
January 1, 1906; the deaths aged $\tfrac{1}{12}$ to $\tfrac{2}{12}$ arise from the births of
November 1, 1904, to December 1, 1905; and so on until finally the deaths aged
$\tfrac{11}{12}$ to $1$ arise from the births of January 1, 1904, to February 1, 1905.
Consequently in order to find, say, $(1 - f_0^z)$ which, being
$\alpha D/(\delta D + \alpha D)$, is the proportion of the calendar year’s deaths which arise from the
* A diagrammatic representation of this principle, which may be useful to some
students, is given (largely from Czuber’s Wahrscheinlichkeitsrechnung) by Glover in
the U.S. Life Tables, 1890, etc., pp. .W et seq.




births of the later of the two calendar years, we may take

$$(1 - f_0^z) = \left(\tfrac{23}{24}d_1 + \tfrac{21}{24}d_2 + \tfrac{19}{24}d_3 + \cdots + \tfrac{1}{24}d_{12}\right) \div \sum_{n=1}^{12} d_n,$$

where $d_n$ = deaths in the $n$th month of the year of age. Or, using areas
instead of ordinates as in the U.S. Life Tables, 1890, etc., p. 340, we should
find similarly

•“I 

A somewhat more precise method, which cannot be expressed readily
by a formula, was used in the U.S. Life Tables and Actuarial Tables,
1939-41, pp. 117-18, where advantage was taken of tabulations of deaths under
one year of age by age at death in months and also by calendar month
of death. In most instances it is possible from this information to
determine whether the birth occurred in the calendar year of death or in the
preceding year. In cases of doubt (e.g., for deaths occurring in March at
the age of two months in completed months) it was assumed that one-half
of such deaths arise from the births of each of the two calendar years
involved. An exception was made in the case of deaths in January at ages
under one month, where data were available for certain subdivisions of
the first month of life and appropriate factors were accordingly applied to
the figures for these various subdivisions. By these rules the deaths $\alpha D_0^z$
were estimated, and $f_0^z$ then followed.
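The calculation of the factor from deaths tabulated by months of age can be sketched as below. The sketch assumes that births and deaths are spread evenly, so that for deaths in the nth month of age the share arising from the later calendar year's births is read off at the middle of that month, namely (25 - 2n)/24; this weighting, the function name, and the figures in the usage note are assumptions for illustration and do not reproduce the more refined official procedure just described.

```python
def separation_factor(monthly_deaths):
    """Share of a calendar year's deaths at age 0-1 arising from the
    births of the preceding calendar year.

    monthly_deaths[n - 1] -- deaths in the n-th month of the year of age
    """
    if len(monthly_deaths) != 12:
        raise ValueError("twelve monthly figures are required")
    total = sum(monthly_deaths)
    if total == 0:
        raise ValueError("no deaths recorded")
    # Share from the later year's births: 23/24, 21/24, ..., 1/24.
    later = sum(d * (25 - 2 * n) / 24.0
                for n, d in enumerate(monthly_deaths, start=1))
    return 1.0 - later / total
```

If the deaths were the same in every month of age the factor would be exactly one-half; when they are concentrated in the earliest months, as infant deaths are, it falls well below one-half.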

The values of $f_x^z$ which have been found by these methods have shown
significant changes during recent years. For example, in the U.S. Life
Tables, 1890, 1901, 1910, and 1901-10, the values for all calendar years
finally adopted by Glover were $f_0 = .28$ for males and $.29$ for females;
$f_1 = .41$; $f_2 = .47$; $f_3 = .48$; $f_4 = .48$; and at higher ages $f_x = .5$; and for the
New York State data employed in T.A.S.A., XXIII, 435, and XXIV, 126,
the same values were used except $f_0 = .3$ as determined by Henderson from
the 1909-11 N.Y. State data alone. Subsequent tabulations for the first
year of age given by I. M. Moriyama and T. N. E. Greville (Bureau of the
Census, Vital Statistics Special Reports, XIX, No. 21) indicate that for
the U.S. birth registration area, as a result of the large decrease in
mortality in the later subdivisions of the first year of life, $f_0$ decreased fairly
steadily from .263 in 1920 to .167 in 1942. In the U.S. Life Tables and
Actuarial Tables, 1939-41, p. 118, the calculations produced $f_0 = .207$ for
all males in 1935, with variations which led to the adoption of different
values, lying between extremes of .162 and .348, by race and sex in each
year from 1934 to 1941; and at ages 1 to 4, as no U.S. data were available
for their estimation, Glover’s values were again employed after they had
been confirmed approximately by a rough theoretical test (op. cit., p. 135).
In order, therefore, to obtain actual U.S. values for use in the preparation
of future tables, a 10% sample (amounting to slightly over 6,000 cases) of
the deaths at ages under 5 in 1944, 1945, and 1946 has since been tabulated
according to year of birth, which indicated that, despite moderate
irregularities, the values of $f_x$ at ages 1-4 have approached closer to .5 since
Glover’s tables were constructed (see “Investigation of Separation
Factors at Ages 1-4 Based on 10% Mortality Sample,” Vital Statistics
Special Reports, Federal Security Agency, XXX, No. 7).
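Once values of the factor have been adopted, the division of a calendar year's deaths between the two generations is a simple multiplication. The sketch below uses the values quoted above for Glover's male tables, with .5 at higher ages; the names `GLOVER_F_MALES` and `split_deaths` and the death counts in the usage note are hypothetical.

```python
# Values quoted in the text for males, ages 0 to 4; .5 is used at higher ages.
GLOVER_F_MALES = {0: 0.28, 1: 0.41, 2: 0.47, 3: 0.48, 4: 0.48}


def split_deaths(age, deaths, factors=GLOVER_F_MALES, default=0.5):
    """Divide a calendar year's deaths at one age between the two birth years.

    Returns (deaths from the earlier year's births,
             deaths from the later year's births).
    """
    f = factors.get(age, default)
    from_earlier = f * deaths
    return from_earlier, deaths - from_earlier
```

For example, 1,000 male deaths under one year of age would be divided into about 280 from the preceding year's births and 720 from the births of the calendar year itself.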

Some additional material for Norway (1912-26 and 1934-35), Sweden
(1915-45), and Denmark (1922-46) may be found in a paper by V.
Valaoras on “Refined Rates for Infant and Childhood Mortality”
(Population Studies, IV, 253).

66. The factor $f_x^z$ divides the deaths $D_x^z$ of the year of age x to
x + 1 which occur in the calendar year z (cf. also pars. 69 and 76
hereafter). In the development of the subsequent formulae it will be useful to
employ also an analogous factor, $k_x^z$, for dividing the deaths
$({}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1})$ of the year of age x to x + 1 which emerge
directly by following the movement of the population over the year of age
and thus involve two calendar years z and z + 1 (cf. pars. 68 and 75
hereafter); and we shall therefore use $k_x^z$ to denote

$$k_x^z = \frac{{}_{\delta}D_x^{z+1}}{{}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1}},$$

namely, the proportion of the deaths of the year of age which occur
between the commencement of the calendar year z + 1 and attainment of
age x + 1 in that year. The numerical values of $k_x^z$ are not generally
required, for $f_x^z$ is usually determined first, so that ${}_{\alpha}D$ and ${}_{\delta}D$ result from
it, and hence $k_x^z$ follows. The actual values, however, are very close to
$f_x^z$, since any variation between them is due solely to the variation in the
volume of the data of the different calendar years; and in the life table, of
course, $k_x$ is equal to $f_x$ (see T.A.S.A., XXIV, 141-42). For the first year
of age, however, $k_0$ has been calculated approximately in the Supplement
to the 75th Registrar-General's Report, Part I, p. 5, and Part II, p. xxi,
by assuming uniform distributions, and hence taking $(1 - k_0)$ as the proportion
of the deaths 0-1 in the calendar year which occur in the second six
months of age, according to the formula

( 1 - ifeo) = (</l + rfs+ ... + </«)-5-. 




Such values, however, will in general be too low, because $d_n$ decreases
rapidly as n increases (see T.A.S.A., XXIV, 144).


Formulae for the Rates of Mortality


67. In now considering the various formulae by which the rate of
mortality in the year of age x to x + 1 may be obtained from data which
involve both years of age and calendar years, it is essential to remember
clearly the manner in which a population moves (a) over the year of age,
and (b) over the calendar year.

68. (a) The movement over the year of age is shown by the fact that
the $E_x^z$ who enter upon the year of age x to x + 1 during the calendar year
z change, during (on the average) the latter part of that calendar year, by
the occurrence of deaths ${}_{\alpha}D_x^{z}$ to $P_x^{z+1}$ at the end; and that these $P_x^{z+1}$
persons then change, during (on the average) the first portion of the next
calendar year z + 1, by the occurrence of the deaths ${}_{\delta}D_x^{z+1}$ to the $E_{x+1}^{z+1}$
who attain the end of the year of age x to x + 1. Consequently

$$q_x = \frac{E_x^z - E_{x+1}^{z+1}}{E_x^z} \qquad (8)^*$$

$$\phantom{q_x} = \frac{{}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1}}{E_x^z}. \qquad (9)$$

This may also be written

$$q_x = 1 - \frac{P_x^{z+1}}{E_x^z}\cdot\frac{E_{x+1}^{z+1}}{P_x^{z+1}},$$

so that if we write ${}_{\alpha}p_x^{z}$ for the probability, $P_x^{z+1}/E_x^z$, that a person attaining
age x during the calendar year z will survive over (on the average) the
latter part of the calendar year to the end of that year, and ${}_{\delta}p_x^{z+1}$ for the
probability, $E_{x+1}^{z+1}/P_x^{z+1}$, that a person in the year of age x to x + 1 at the
beginning of the calendar year z + 1 will survive over (on the average) the
earlier part of the calendar year until attainment of age x + 1 during that
year, then

$$q_x = 1 - {}_{\alpha}p_x^{z}\cdot{}_{\delta}p_x^{z+1}. \qquad (10)$$


These formulae (8)-(10) will be called the “Type I formulae” following
T.A.S.A., XXIII, 271. It will be noted that the $q_x$ so found is determined
from the data of two calendar years, and therefore should strictly
carry an index showing both of those years. The resulting values, consequently, as shown
numerically in Table H, T.A.S.A., XXIV, 154, are a blend of the rates of
those two years.

* The numbering of the formulae used in T.A.S.A., XXIV, 139 et seq., is retained
here for ease of reference.



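From the definitions that

$${}_{\alpha}p_x^{z} = \frac{P_x^{z+1}}{E_x^z} \quad\text{and}\quad {}_{\delta}p_x^{z+1} = \frac{E_{x+1}^{z+1}}{P_x^{z+1}}$$

it will be seen also, since ${}_{\alpha}q = 1 - {}_{\alpha}p$ and ${}_{\delta}q = 1 - {}_{\delta}p$, from (3) that

$${}_{\alpha}q_x^{z} = \frac{{}_{\alpha}D_x^{z}}{E_x^z},$$

and from (2) that

$${}_{\delta}q_x^{z+1} = \frac{{}_{\delta}D_x^{z+1}}{P_x^{z+1}}.$$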


69. (b) The movement over the calendar year, on the other hand, is
shown by the fact that the population $P_x^z$ at the beginning of the calendar
year changes during (on the average) the first portion of that calendar
year by the occurrence of deaths ${}_{\delta}D_x^{z}$ to the $E_{x+1}^{z}$ who attain the end of the
year of age x to x + 1 during the year; and that they are replaced by $E_x^z$
who enter the year of age x to x + 1 during the year and who during (on
the average) the second portion of the calendar year, by the occurrence
of deaths ${}_{\alpha}D_x^{z}$, are reduced to the population $P_x^{z+1}$ at the end.

Hence, on account of the replacement of $E_{x+1}^{z}$ by $E_x^z$ during the calendar
year, formulae analogous to (8) and (9) cannot be written down; but, using
ratios, we have immediately

$$q_x^z = 1 - \frac{E_{x+1}^{z}}{P_x^z}\cdot\frac{P_x^{z+1}}{E_x^z} \qquad (11)$$

$$\phantom{q_x^z} = 1 - {}_{\delta}p_x^{z}\cdot{}_{\alpha}p_x^{z}. \qquad (12)$$

These are the formulae given by R. Henderson in T.A.S.A., XXIII, 437;
and, as in T.A.S.A., XXIII, 272, they will be referred to as the “Type III
formulae.”* They give the true mortality, $q_x^z$, of the year of age x to x + 1
in the calendar year z.†

* The “Type II” formula for q, which is indicated in Glover's U.S. Life Tables, p.
334 (see also T.A.S.A., XXIII, 271), is simply

$$1 - \frac{P_{x+1}^{z+1}}{P_x^z} \quad\text{or}\quad 1 - {}_{\delta}p_x^{z}\cdot{}_{\alpha}p_{x+1}^{z},$$

and is obtained by following $P_x^z$ to $E_{x+1}^{z}$ and thence directly to $P_{x+1}^{z+1}$. It is not generally
useful, however, for it does not give the rate of mortality of an exact year of age, but
follows instead the year from the fractional age at the beginning of a calendar year
to the fractional age at the end. Since, however, it involves the populations only, it may
be employed to give a rate of mortality of the above character from census returns
alone, as in par. 83 here.

† The following analysis given by H. H. Wolfenden in T.A.S.A., XXIV, 165, may
also be of assistance: “The Type III formula, in order to derive the rate of mortality






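70. The difference between the Type I formula (10) and the Type III
formula (12) is merely that (10) is changed by substituting ${}_{\alpha}p_x^{z+1}$ for the
equivalent ${}_{\alpha}p_x^{z}$. If the mortality shows no variation over successive
calendar years, then ${}_{\alpha}p_x^{z} = {}_{\alpha}p_x^{z+1}$ and the Type I and Type III values will be
identical. Alternatively, if, upon the same supposition that there is no
variation over successive calendar years, ${}_{\alpha}p_x^{z}$ in the Type I formula is
replaced by its equivalent ${}_{\alpha}p_x^{z+1}$, then any of the Type I formulae, which
give the Type I $q_x$, will clearly be changed so that they will give $q_x^{z+1}$.
Thus, from (9),

$$q_x = \frac{{}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1}}{E_x^z}. \qquad (13)$$

But, when ${}_{\alpha}p_x^{z} = {}_{\alpha}p_x^{z+1}$ it follows, writing ${}_{\alpha}q$ for $1 - {}_{\alpha}p$, that

$$\frac{{}_{\alpha}D_x^{z}}{E_x^z} = \frac{{}_{\alpha}D_x^{z+1}}{E_x^{z+1}}, \qquad (14)$$

whence

$$q_x^{z+1} = \frac{{}_{\delta}D_x^{z+1}}{k_x^z E_x^z} = \frac{{}_{\alpha}D_x^{z+1}}{(1 - k_x^z)E_x^{z+1}} \qquad (15)$$

$$\phantom{q_x^{z+1}} = \frac{D_x^{z+1}}{k_x^z E_x^z + (1 - k_x^z)E_x^{z+1}}. \qquad (16)$$

These are the true formulae for the rate of mortality in terms of those who
attain age x during the calendar year.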


from the data of the single calendar year z, instead of from the two calendar years z and
z + 1 (as in Type I), assumes that ${}_{\alpha}p_x^{z}$ may be compounded with ${}_{\delta}p_x^{z}$ instead of with
${}_{\delta}p_x^{z+1}$. The consequence of this is that $p_x^z$, being ${}_{\alpha}p_x^{z}\cdot{}_{\delta}p_x^{z}$, involves, by definition, a
substitution of one group for another, because (remembering that the objective is to
obtain the rate of survival, or mortality, over the year of age, from the data of only one
calendar year, notwithstanding the fact that a year of age necessarily involves, on the
average, two calendar years) ${}_{\alpha}p_x^{z}$ deals with a body of lives of whom $P_x^{z+1}$ are surviving
at the end of the period to which it relates, while ${}_{\delta}p_x^{z}$ is derived from a body of lives
which starts as $P_x^{z}$ instead of the preceding $P_x^{z+1}$. This substitution of $P_x^{z}$ for $P_x^{z+1}$, both
of which are populations in the year of age x to x + 1, implicitly assumes that $P_x^{z}$ and
$P_x^{z+1}$ are of the same age constitution, to the extent that the probabilities ${}_{\delta}p_x^{z}$ and ${}_{\delta}p_x^{z+1}$
to which they give rise may be assumed to be identical. This is the fundamental basis of
all the Type III formulae; and the Type III probability $p_x^z$ may therefore be defined
as the true (Type I) probability of (x) surviving to age x + 1, modified by the
convenient supposition that ${}_{\delta}p_x^{z} = {}_{\delta}p_x^{z+1}$, that is, that there is no variation over successive
calendar years.”





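Since $k_x^z$ is very nearly equal to $f_x$ (see par. 66), formula (16) may also
be written approximately as

$$q_x^{z+1} = \frac{D_x^{z+1}}{f_x E_x^z + (1 - f_x)E_x^{z+1}}. \qquad (17)$$

And this formula has frequently been modified further by assuming
uniform progression in the movement of the population over the year of age
(see par. 75 here, and T.A.S.A., XXIV, 142) so that ${}_{\alpha}D_x^{z} = {}_{\delta}D_x^{z+1}$, that is,
$k_x^z = \tfrac{1}{2}$, and (16) becomes merely

$$q_x^{z+1} = \frac{D_x^{z+1}}{\tfrac{1}{2}\,(E_x^z + E_x^{z+1})}. \qquad (18)$$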
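
As a numerical illustration, formulae (17) and (18) may be sketched in Python. The sketch assumes that (17) has the form q = D / (f E' + (1 - f) E''), where E' denotes those attaining age x in the preceding calendar year and E'' those attaining it in the calendar year of the deaths. The function names and the figures are hypothetical and are not taken from any table cited here.

```python
def q_separation(deaths, entrants_prev, entrants_curr, f):
    """Rate of mortality on the assumed reading of formula (17).

    deaths        -- deaths at age x last birthday in the calendar year
    entrants_prev -- number attaining age x in the preceding calendar year
    entrants_curr -- number attaining age x in the calendar year itself
    f             -- separation factor (share of the deaths falling on
                     the entrants of the preceding year)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("the separation factor must lie between 0 and 1")
    return deaths / (f * entrants_prev + (1.0 - f) * entrants_curr)


def q_uniform(deaths, entrants_prev, entrants_curr):
    """Formula (18): the special case in which the factor is one half."""
    return q_separation(deaths, entrants_prev, entrants_curr, 0.5)


# Hypothetical figures for the first year of age (entrants are births).
births_prev, births_curr, infant_deaths = 100_000, 110_000, 5_400
print(q_separation(infant_deaths, births_prev, births_curr, 0.2))
print(q_uniform(infant_deaths, births_prev, births_curr))
```

With rising births, the value one half weights the smaller earlier cohort too heavily and so gives a somewhat higher rate than the factor .2 in this example, which is the kind of difference the text attributes to the choice of separation factor at infantile ages.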


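Another formula (published first by I. M. Moriyama and T. N. E.
Greville; see par. 79 here) in terms of $E_x^z$, $E_x^{z+1}$, and $D_x^{z+1}$, but using $f_x^{z+1}$
instead of $k_x^z$ as in (16), is obtainable easily by writing the fundamental
Type III relation (12) as

$$q_x^{z+1} = 1 - (1 - {}_{\alpha}q_x^{z+1})(1 - {}_{\delta}q_x^{z+1}) = {}_{\alpha}q_x^{z+1} + {}_{\alpha}p_x^{z+1}\cdot{}_{\delta}q_x^{z+1}.$$

Then from par. 68,

$${}_{\alpha}q_x^{z+1} = \frac{{}_{\alpha}D_x^{z+1}}{E_x^{z+1}} \quad\text{and}\quad {}_{\alpha}p_x^{z+1}\cdot{}_{\delta}q_x^{z+1} = {}_{\alpha}p_x^{z+1}\,\frac{{}_{\delta}D_x^{z+1}}{P_x^{z+1}};$$

by formula (25) given hereafter this is ${}_{\delta}D_x^{z+1}/E_x^z$, which from par. 65 is $f_x^{z+1}D_x^{z+1}/E_x^z$.
Hence we find

$$q_x^{z+1} = D_x^{z+1}\left[\frac{1 - f_x^{z+1}}{E_x^{z+1}} + \frac{f_x^{z+1}}{E_x^z}\right]. \qquad (18a)$$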


71. Formulae (15)-(18a) for $q_x^{z+1}$ proceed upon principle (b) of par. 69,
by which the calendar years' deaths are related to the births from which
they arise. Analogous formulae, however, may also be found which,
following principle (a), par. 68, start from the births of the calendar year and
trace the deaths of the two calendar years to which they give rise. Thus
from (9),

$$q_x = \frac{{}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1}}{E_x^z} = \frac{(1 - f_x^{z})D_x^{z} + f_x^{z+1}D_x^{z+1}}{E_x^z}, \qquad (21)$$




which becomes, when $f_x^z$ is the same for all values of z,

$$q_x = \frac{(1 - f_x)D_x^{z} + f_x D_x^{z+1}}{E_x^z}, \qquad (22)$$

and when $f_x$ is taken as $\tfrac{1}{2}$,

$$q_x = \frac{\tfrac{1}{2}\,(D_x^{z} + D_x^{z+1})}{E_x^z}. \qquad (23)$$

These formulae still give the Type I mortality, whereas the analogous
Nos. (15)-(18a) give, as is more desirable, the Type III mortality,
$q_x^{z+1}$, of the calendar year. Formulae (15)-(18a) are therefore preferable;
and consequently they are generally used in practice.

72. All the preceding formulae express q in terms of E. Where, however,
as is usual for ages above 5, the census populations are sufficiently reliable,
it is desirable to deduce q directly from those populations and the deaths
of the calendar years. That is, a Type III formula is required in terms of
P and D only, and is obtainable as follows: Substituting $P_x^{z+1} + {}_{\alpha}D_x^{z}$ for $E_x^z$
according to (3) in the denominator of (13) we find

$$q_x = \frac{{}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1}}{P_x^{z+1} + {}_{\alpha}D_x^{z}}. \qquad (24)$$

In order to transform this to a Type III formula we take, as in par. 70,
${}_{\alpha}p_x^{z} = {}_{\alpha}p_x^{z+1}$, that is,

$$\frac{P_x^{z+1}}{E_x^z} = \frac{P_x^{z+2}}{E_x^{z+1}}, \qquad (25)$$

and from this and (14) it follows that

$${}_{\alpha}D_x^{z} = \frac{P_x^{z+1}}{P_x^{z+2}}\;{}_{\alpha}D_x^{z+1}, \qquad (26)$$

by which (24) becomes

$$q_x^{z+1} = \frac{D_x^{z+1}}{P_x^{z+1} + (1 - k_x^z)\left[P_x^{z+2} - P_x^{z+1} + D_x^{z+1}\right]}. \qquad (27)$$

This is the generalization of the usual formula for obtaining q by dividing
the deaths by the mean population plus half the deaths.
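
The remark that (27) generalizes "deaths divided by the mean population plus half the deaths" can be checked numerically. The following Python sketch assumes that (27) has the form q = D / (P' + (1 - k)(P'' - P' + D)), with P' and P'' the populations at the beginning and end of the calendar year; the function names and figures are hypothetical.

```python
def q_from_populations(deaths, pop_start, pop_end, k=0.5):
    """Assumed reading of formula (27), using populations and deaths only.

    k is the share of a cohort's deaths in the year of age that falls in
    the second of the two calendar years concerned (par. 66).
    """
    denominator = pop_start + (1.0 - k) * (pop_end - pop_start + deaths)
    return deaths / denominator


def q_mean_population(deaths, pop_start, pop_end):
    """The usual formula: deaths / (mean population + half the deaths)."""
    return deaths / (0.5 * (pop_start + pop_end) + 0.5 * deaths)


# Hypothetical data for one age in one calendar year.
deaths, pop_start, pop_end = 16, 100, 110

# With k = 1/2 the general form coincides with the usual formula.
print(q_from_populations(deaths, pop_start, pop_end, k=0.5))
print(q_mean_population(deaths, pop_start, pop_end))

# With k = 1/3 (deaths concentrated early in the year of age) it does not.
print(q_from_populations(deaths, pop_start, pop_end, k=1 / 3))
```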





73. Where E and P are both available, (27) may also, by equation (1),
be written in the convenient form

$$q_x^{z+1} = \frac{D_x^{z+1}}{P_x^{z+1} + (1 - k_x^z)\,(E_x^{z+1} - E_{x+1}^{z+1})}. \qquad (28)$$


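74. In order now to obtain a formula in terms of E and the increase of
population $\Delta P_x^{z+1} = P_x^{z+2} - P_x^{z+1}$, say, as is sometimes required (see pars. 79
and 92), the Type I formula (13) may at once be transformed by (25) into

$$q_x^{z+1} = \frac{D_x^{z+1}}{E_x^{z+1} - \dfrac{k_x^z E_x^z}{P_x^{z+1}}\,\Delta P_x^{z+1}}. \qquad (29)$$

This formula may also be modified slightly by writing $E_x^{z+1}/P_x^{z+2}$ for
$E_x^z/P_x^{z+1}$ by formula (25).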



Formulae for the Rates of Mortality on the Assumption of 
Uniform Distributions* 

75. The formulae of pars. 67-74 have all been obtained without the
introduction of any assumption other than that $q_x^z = q_x^{z+1}$; and they are
therefore suitable for the calculation of the rates of mortality at infantile
ages, where $f_x^z$ and $k_x^z$ are not equal to $\tfrac{1}{2}$ and the assumption of uniform
distributions is consequently inadmissible. For ages above 5, however,
that assumption may be employed; and the manner in which it is
introduced, and the modifications which result in the preceding expressions,
may therefore now be considered.

In par. 68 it was explained how the population moves over the year of
age x to x + 1 by $E_x^z$ changing to $P_x^{z+1}$ by the deaths ${}_{\alpha}D_x^{z}$, and then again
changing from $P_x^{z+1}$ to $E_{x+1}^{z+1}$ by the deaths ${}_{\delta}D_x^{z+1}$. In order to maintain a
uniform distribution of the deaths over the year of age, ${}_{\alpha}D_x^{z}$ will therefore

* The method used here for stating the assumption of uniform distribution over the
year of age in the Type I population by means of equations (30)-(32), and the
assumptions of uniform movement in the Type III population by the relations (34)-(35) over
the year of age and (37)-(38) over the calendar year, was published originally, together
with the greatly simplified proofs to which the method leads (as shown in pars. 75-77
here), by H. H. Wolfenden in the paper “On the Determination of the Rates of
Mortality at Infantile Ages, from Statistics of the General Population” (T.A.S.A., XXIV,
146-49) previously mentioned in the footnote to par. 61.

In addition to those demonstrations, the student may be referred to the alternative
series of even shorter parallel proofs given in the same paper on pp. 155-58 (portions
of which are reproduced here), and also to the further comments on pp. 165-66 thereof.





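equal ${}_{\delta}D_x^{z+1}$. This may also be put as

$$(E_x^z - P_x^{z+1}) = (P_x^{z+1} - E_{x+1}^{z+1}), \qquad (30)$$

that is,

$$1 - \frac{P_x^{z+1}}{E_x^z} = \frac{P_x^{z+1}}{E_x^z}\left(1 - \frac{E_{x+1}^{z+1}}{P_x^{z+1}}\right), \qquad (31)$$

which is

$${}_{\alpha}q_x^{z} = {}_{\alpha}p_x^{z}\cdot{}_{\delta}q_x^{z+1}. \qquad (32)$$

This last equation therefore states, in terms of probabilities, the
assumption of a uniform distribution of deaths over the year of age, in a
population of Type I. The same equation is also seen to be true in the ideal Type
I population of the life table; for with uniformly distributed deaths,
$(l_x - l_{x+\frac{1}{2}}) = (l_{x+\frac{1}{2}} - l_{x+1})$, and this, as above, may be written
${}_{\frac{1}{2}}q_x = {}_{\frac{1}{2}}p_x\cdot{}_{\frac{1}{2}}q_{x+\frac{1}{2}}$; that is, the probability of (x) dying in the first half of the
year of age equals the probability of his dying in the second half.

Applying, therefore, the assumption of uniform distribution of deaths
as embodied in equations (30) and (32) to the fundamental Type I
formula (8), the latter at once becomes

$$q_x = \frac{E_x^z - E_{x+1}^{z+1}}{P_x^{z+1} + \tfrac{1}{2}(E_x^z - E_{x+1}^{z+1})} = \frac{{}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1}}{P_x^{z+1} + \tfrac{1}{2}({}_{\alpha}D_x^{z} + {}_{\delta}D_x^{z+1})}. \qquad (33)$$

This formula is analogous to the usual

$$q_x = \frac{d_x}{L_x + \tfrac{1}{2}d_x}$$

of the life table, and to the Type III formulae (27)-(28) and (36).

Another new type of proof for (33) was also given by H. H. Wolfenden
in T.A.S.A., XXIV, 155-56, based on the Lemma that (as may be seen by
putting $X/a = Y/b = k$) if $X/Y = a/b$, then

$$\frac{X}{a} + \frac{Y}{b} = \frac{2(X + Y)}{a + b}.$$

For the Type I formula is, as previously stated,

$$q_x = 1 - {}_{\alpha}p_x^{z}\cdot{}_{\delta}p_x^{z+1}. \qquad (10)$$

Introducing the condition of uniformity (32) by using it to eliminate ${}_{\delta}p_x^{z+1}$
from (10) we get $q_x = 2\,{}_{\alpha}q_x^{z} = 2\,{}_{\alpha}D_x^{z}/E_x^z$, and eliminating ${}_{\alpha}p_x^{z}$ similarly we
obtain

$$q_x = \frac{2\,{}_{\delta}D_x^{z+1}}{2P_x^{z+1} - E_{x+1}^{z+1}}.$$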






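From these two relations, therefore,

$$q_x = \frac{{}_{\alpha}D_x^{z}}{E_x^z} + \frac{{}_{\delta}D_x^{z+1}}{2P_x^{z+1} - E_{x+1}^{z+1}} = \frac{X}{a} + \frac{Y}{b}, \qquad (A)$$

where $X = {}_{\alpha}D_x^{z}$, $a = E_x^z$, $Y = {}_{\delta}D_x^{z+1}$, and $b = 2P_x^{z+1} - E_{x+1}^{z+1}$.

Now the uniformity condition when expressed as (30) means that
${}_{\alpha}D_x^{z} = {}_{\delta}D_x^{z+1}$, or here X = Y, and from (30) also $E_x^z = 2P_x^{z+1} - E_{x+1}^{z+1}$, that is, a = b.
Consequently, $X/Y = a/b$, and this relation and (A), by the Lemma, immediately
produce (33).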
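
The Lemma and its use can be verified with exact arithmetic. The Python sketch below assumes the Lemma reads "if X/Y = a/b, then X/a + Y/b = 2(X + Y)/(a + b)" and that (33) is the deaths of the year of age divided by the intermediate population plus half those deaths; the cohort figures and all names are hypothetical.

```python
from fractions import Fraction


def lemma_sides(X, a, Y, b):
    """Return (X/a + Y/b, 2(X + Y)/(a + b)) as exact fractions."""
    left = Fraction(X, a) + Fraction(Y, b)
    right = 2 * Fraction(X + Y, a + b)
    return left, right


# Hypothetical Type I cohort with uniformly distributed deaths.
E_x = 1000                  # attain age x during calendar year z
alpha_D = 20                # deaths in the remainder of year z
delta_D = 20                # deaths in year z + 1 before age x + 1
P_mid = E_x - alpha_D       # population at the beginning of year z + 1
E_next = P_mid - delta_D    # attain age x + 1 during year z + 1

left, right = lemma_sides(alpha_D, E_x, delta_D, 2 * P_mid - E_next)
q_33 = Fraction(alpha_D + delta_D) / (P_mid + Fraction(alpha_D + delta_D, 2))
print(left, right, q_33)

# When X/Y differs from a/b the two sides of the Lemma no longer agree.
print(lemma_sides(1, 2, 1, 3))
```

Exact fractions are used so that equality of the two sides is a genuine identity check rather than a floating-point coincidence.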

76. Coming now to the movement of the populations over the calendar
year, as required in the formulae of Type III, it was shown in par. 69 how,
in the calendar year z to z + 1, $P_x^z$ passes to $E_{x+1}^{z}$, and is replaced by $E_x^z$
which passes to $P_x^{z+1}$ at the end. Applying the above principles, the
movement may therefore be followed, in such populations, either (i) over the
year of age, or (ii) over the calendar year.

(i) In tracing the year of age (in the calendar year) we have to follow
$E_x^z$ to $P_x^{z+1}$, which is then replaced by $P_x^z$ and passes to $E_{x+1}^{z}$; and therefore,
in order for $E_x^z$ to pass uniformly to $E_{x+1}^{z}$ as required, we must have

$$(E_x^z - P_x^{z+1})\,P_x^z = P_x^{z+1}\,(P_x^z - E_{x+1}^{z}), \qquad (34)$$

that is,

$${}_{\alpha}q_x^{z} = {}_{\alpha}p_x^{z}\cdot{}_{\delta}q_x^{z}. \qquad (35)$$

This equation, expressive of uniform movement over the year of age in the
Type III population, is, of course, obtainable also from the condition (32)
for uniform movement over the year of age in the Type I population, by
substituting ${}_{\delta}q_x^{z}$ for ${}_{\delta}q_x^{z+1}$, as in the case of the similar transformations from
Type I to Type III previously discussed in par. 70.

The introduction of condition (34)-(35) into the Type III formula (11)
will therefore produce the true calendar year formula for the particular
case of uniform movement over the year of age, the resulting formula
being

$$q_x^z = \frac{D_x^z}{\tfrac{1}{2}(P_x^z + P_x^{z+1}) + \tfrac{1}{2}D_x^z}. \qquad (36)$$





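This may be shown as follows:

Condition (34) is $(E_x^z - P_x^{z+1})P_x^z = P_x^{z+1}(P_x^z - E_{x+1}^{z})$, from which

$$\frac{P_x^{z+1}}{E_x^z} = \frac{P_x^z}{2P_x^z - E_{x+1}^{z}}.$$

Hence (11), or

$$q_x^z = 1 - \frac{E_{x+1}^{z}}{P_x^z}\cdot\frac{P_x^{z+1}}{E_x^z},$$

becomes

$$q_x^z = 1 - \frac{E_{x+1}^{z}}{2P_x^z - E_{x+1}^{z}} = \frac{2\,{}_{\delta}D_x^{z}}{2P_x^z - E_{x+1}^{z}}.$$

But $2P_x^z - E_{x+1}^{z} = P_x^z + {}_{\delta}D_x^{z} = P_x^z(1 + {}_{\delta}q_x^{z})$, while from (34), since
${}_{\alpha}D_x^{z}\,P_x^z = P_x^{z+1}\,{}_{\delta}D_x^{z}$,

$${}_{\delta}D_x^{z} = \frac{P_x^z}{P_x^z + P_x^{z+1}}\,D_x^z,$$

by means of which (36) emerges at once.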

Wolfenden's alternative type of proof given in par. 75, based on the
Lemma there stated, also produces formula (36) easily. For (as pointed
out in T.A.S.A., XXIV, 157), from the Type III formula (12) and the
uniformity condition (35) it follows, by exactly the same method as
before, that

$$q_x^z = \frac{X}{a} + \frac{Y}{b}, \qquad (C)$$

where $X = {}_{\alpha}D_x^{z}$, $a = E_x^z$, $Y = {}_{\delta}D_x^{z}$, and $b = 2P_x^z - E_{x+1}^{z}$.

[In the original paper in T.A.S.A., XXIV, 157, a transposition in these
expressions for a and b occurred in formula (3e), and the first two portions
of formula (3i) should be inverted.]

Also, as before, from the uniformity condition when written in the
extended form (34) we see that

$${}_{\alpha}D_x^{z}\,P_x^z = P_x^{z+1}\,{}_{\delta}D_x^{z}, \quad\text{or}\quad \frac{X}{Y} = \frac{P_x^{z+1}}{P_x^z};$$





while again from (34)

$$E_x^z\,P_x^z = P_x^{z+1}\,(2P_x^z - E_{x+1}^{z}), \quad\text{or}\quad \frac{a}{b} = \frac{P_x^{z+1}}{P_x^z}.$$

Hence $X/Y = a/b$, and from (C) and the Lemma (36) follows directly.

77. (ii) If we now suppose, instead of the above uniform movement
over the year of age, that the population is to show a uniform progression
over the calendar year, we follow $P_x^z$ to $E_{x+1}^{z}$, which is then replaced by $E_x^z$
and passes to $P_x^{z+1}$, so that we must have

$$(P_x^z - E_{x+1}^{z})\,E_x^z = E_{x+1}^{z}\,(E_x^z - P_x^{z+1}), \qquad (37)$$

that is,

$${}_{\delta}q_x^{z} = {}_{\delta}p_x^{z}\cdot{}_{\alpha}q_x^{z}. \qquad (38)$$

This last equation expresses the condition that there is an equal
probability that a person living at the beginning of the calendar year will die in
the first portion or the second portion of that calendar year, just as
equation (35), and similarly (32), expressed the condition of equal
probability of death in the first and second portions of the year of age.

Consequently, introducing this condition (37)-(38) into the Type III
formula (11) we obtain the true calendar year formula for the particular
case of uniform movement over the calendar year, the formula being

$$q_x^z = \frac{D_x^z}{E_x^z - \tfrac{1}{2}(P_x^{z+1} - P_x^z)}, \qquad (39)$$

which may be shown thus:

From (37), ${}_{\alpha}D_x^{z}\,E_{x+1}^{z} = {}_{\delta}D_x^{z}\,E_x^z$, whence

$$D_x^z = {}_{\alpha}D_x^{z}\,\frac{E_x^z + E_{x+1}^{z}}{E_x^z}, \quad\text{and also}\quad \frac{E_{x+1}^{z}}{P_x^z} = \frac{E_x^z}{2E_x^z - P_x^{z+1}}.$$

Hence (11), as in the preceding paragraph, becomes

$$q_x^z = 1 - \frac{P_x^{z+1}}{2E_x^z - P_x^{z+1}} = \frac{2\,{}_{\alpha}D_x^{z}}{E_x^z + {}_{\alpha}D_x^{z}} = \frac{2D_x^z}{E_x^z + E_{x+1}^{z} + D_x^z},$$

and this, since $E_{x+1}^{z} + D_x^z = E_x^z - (P_x^{z+1} - P_x^z)$, at once reduces to the required form (39). Or, in exactly
the same manner, (39) may be obtained by introducing the condition of
uniform movement into the generalized formula (29).

The alternative type of proof based on the Lemma stated in par. 75
again leads to formula (39) in a very simple manner, as was shown by
Wolfenden in T.A.S.A., XXIV, 158.* For, by the method used in similarly
reaching (33) and (36), the introduction of the condition of uniformity
(38) into the Type III formula (12) leads to

$$q_x^z = \frac{X}{a} + \frac{Y}{b}, \qquad (D)$$

where $X = {}_{\delta}D_x^{z}$, $a = P_x^z$, $Y = {}_{\alpha}D_x^{z}$, and $b = 2E_x^z - P_x^{z+1}$.

Also, the uniformity condition (37) states that $X/Y = E_{x+1}^{z}/E_x^z$, X being
${}_{\delta}D_x^{z}$ and Y being ${}_{\alpha}D_x^{z}$; while again from (37) $P_x^z E_x^z = E_{x+1}^{z}(2E_x^z - P_x^{z+1})$, or
$a/b = E_{x+1}^{z}/E_x^z$. Hence $X/Y = a/b$, and from (D) and the Lemma (39) can be
written down immediately.

* Both the proofs given in this paragraph (which, as already remarked in the
footnote to par. 75, depend on the fact that (38) can be used to state the uniformity
assumption in terms of probabilities) are very much easier and quicker than the lengthy
demonstrations involving definite integrals and differential equations which had been
published previously in Czuber's “Wahrscheinlichkeitsrechnung” and elsewhere in
European literature; see T.A.S.A., XXIV, 149.


The Practical Application of the Preceding Formulae

78. In considering the practical applicability of the preceding formulae
it is necessary to distinguish carefully between the “infantile” ages and
the higher ages. The former may usually be treated as comprising the five
ages 0-4 last birthday, inclusive, where $f_x^z$ is not equal to $\tfrac{1}{2}$ and
consequently uniform distributions cannot be assumed (see par. 65).

In practice, $f_x^z$ must be chosen carefully. At each age from 1 to 4 in
modern life-table constructions, which involve only a few calendar years,
$f_x^z$ can usually be taken without variation in respect of z. Moreover, at
ages 2, 3, and 4 (and sometimes, but not always, at age 1) satisfactory
values of q will generally emerge if $f_x$ is given the approximate value $\tfrac{1}{2}$
(cf. H. H. Wolfenden, T.A.S.A., XXV, 149-52, and the Vital Statistics
Special Report on an “Investigation of Separation Factors at Ages 1-4
Based on 10% Mortality Sample” noted in par. 65 here). For ages 5 and
beyond $f_x = \tfrac{1}{2}$ may also be employed for all practical purposes, so that at
those ages the assumption of uniform distributions can be made.

79. The Type III formula (11)-(12), therefore, should be used at the
infantile ages whenever possible, for it gives the theoretically true $q_x^z$
directly without the introduction of any subordinate assumptions. The




Type I formula (8)-(9), and its modifications (21)-(23) and (33), and the
Type II formula (footnote, par. 69), are not so desirable, the former
involving the two calendar years z and z + 1, while Type II does not follow
an integral year of age.

Of the various modifications of the Type III formula, Nos. (15)-(16),
which by the use of $k_x^z$ find $q_x^{z+1}$ from the births and those who attain each
subsequent age x, naturally give sound results because no arbitrary
assumption is involved; in practice, however, the necessity of determining
$k_x^z$ is not convenient. No. (17) consequently has often been used (as stated,
for example, by the Registrar-General of England and Wales, 83rd Annual
Report, 1920, xxviii) because it employs the more accessible $f_x$, and
usually provides a close approximation (see the numerical illustrations
given by H. H. Wolfenden in T.A.S.A., XXIV, 154; and I. M. Moriyama
and T. N. E. Greville, “Effect of Changing Birth Rates upon Infant
Mortality Rates,” Bureau of the Census, Vital Statistics Special Reports,
XIX, No. 21, p. 409). No. (18), with its assumption of uniform movement
over the year of age, will produce reasonable results when a number of
calendar years are employed in which the births and deaths are fairly
uniform (see T.A.S.A., XXIV, 151 and 154); it has formed the basis of the
methods employed in the English Life Tables Nos. 1 to 8 at infantile ages
(see pars. 87 et seq. here). No. (18a) was first published by Moriyama and
Greville (op. cit.) as a convenient formula requiring $f_x^{z+1}$ instead of $k_x^z$; being
derived without any subordinate assumption from the fundamental Type
III, it necessarily produces sound results; and it is used in the United
States by the National Office of Vital Statistics for their annual
computations of infant mortality rates.*

Formulae (26)-(28) may sometimes be useful at the infantile ages (see,
for example, Henderson's “Mortality Laws and Statistics,” p. 96), so long
as the populations used therein may be assumed to be reliable (see par. 92
here).

Formula (29), with the factor in the denominator taken as

$$\frac{k_x^z E_x^z}{P_x^{z+1}},$$

has been used in the U.S. Abridged Life Tables, 1919-20, for the first year
of age; and, as shown in T.A.S.A., XXV, 149-52, it may be applied readily
for the other infantile ages as well, for the above factor may be taken

* In reports on vital statistics in some countries a rough approximation is often
made by calculating merely the ratio of deaths in the first year of age during a
calendar year to the number of children born during the year, i.e., $D_0^z/E_0^z$. That method,
however, as will be seen at once from formulae (15)-(18a) and (21)-(23), may give
misleading results under the usual circumstances of changing birth and death rates.





either from known values or from the consistent tables of $P_x^z$, $E_x^z$, and
$f_x^z$ which may be built up as in par. 92 hereafter.
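
The warning in the footnote to this paragraph, that the crude ratio of infant deaths to births of the same calendar year may mislead when births are changing, can be illustrated numerically. The Python sketch below assumes that formula (18a) has the form q = D [(1 - f)/E'' + f/E'], with E'' the births of the calendar year of the deaths and E' the births of the preceding year; the data are invented so that every cohort has a true rate of .05.

```python
def crude_infant_ratio(deaths, births_same_year):
    """Rough approximation: infant deaths / births of the same year."""
    return deaths / births_same_year


def cohort_weighted_rate(deaths, f, births_prev, births_same_year):
    """Assumed reading of (18a): each part of the year's infant deaths
    is related to the births from which it arose."""
    return deaths * ((1.0 - f) / births_same_year + f / births_prev)


# Hypothetical data: each birth cohort loses 5 per cent in its first
# year of age, one fifth of those deaths falling in the next calendar year.
q_true, share_next_year = 0.05, 0.2
births_prev, births_curr = 100_000, 120_000

deaths_from_curr = (1 - share_next_year) * q_true * births_curr
deaths_from_prev = share_next_year * q_true * births_prev
deaths = deaths_from_curr + deaths_from_prev
f = deaths_from_prev / deaths   # separation factor of the calendar year

print(crude_infant_ratio(deaths, births_curr))
print(cohort_weighted_rate(deaths, f, births_prev, births_curr))
```

In this constructed case the crude ratio understates the rate because the deaths arising from the smaller earlier cohort are divided by the larger current births, while the cohort-weighted form recovers the assumed .05.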

80. Of the formulae which result from the assumptions of uniform
distributions, (36), or occasionally the less usual form (33), is generally
used as the basis of the calculations at all ages except the infantile ages,
where it overstates the mortality seriously. It is commonly applied by
using the corresponding form for the central death rate, $m_x$, the ratio of
the deaths during the year to the population in the middle of the year.
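
The "corresponding form for the central death rate" may be sketched as follows. The sketch assumes that (36) is the deaths divided by the mean population plus half the deaths, and that the mid-year population is approximated by the mean of the populations at the beginning and end of the year, so that q = m / (1 + m/2); the function names and figures are hypothetical.

```python
def central_death_rate(deaths, mid_year_population):
    """m: deaths of the year divided by the mid-year population."""
    return deaths / mid_year_population


def q_from_m(m):
    """Rate of mortality from the central death rate, q = m / (1 + m/2)."""
    return m / (1.0 + 0.5 * m)


# Hypothetical data for one age above the infantile ages.
deaths, pop_start, pop_end = 200, 9_900, 10_100
mid_year = 0.5 * (pop_start + pop_end)

m = central_death_rate(deaths, mid_year)
print(m, q_from_m(m))

# The same value follows directly from deaths / (mean population + D/2).
print(deaths / (mid_year + 0.5 * deaths))
```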

Formula (39) also results in an overstatement of the mortality at the
infantile ages if it is applied directly. This is shown in T.A.S.A., XXIV,
150-54, where the numerical results of all these formulae are compared,
and also in T.A.S.A., XXV, 149-52. At those ages, therefore, it should be
used either in its general form (29) (in which case uniform distributions
are not assumed), or it should be applied over shorter age periods than an
entire year of age, as was actually done by Glover in the U.S. Life Tables,
1890, etc. (p. 341, par. 112), where for the first year of age the formula
was applied to each separate month of age. An extension of formula (39),
with the denominator adjusted for the case when the
net migration at age x last birthday, $(NM)_x$, is sufficient to affect the
results, is examined by T. N. E. Greville in R.A.I.A., XXXI, 368-73,
and is there related to the corresponding formula by which the “exposed
to risk” can be computed from the individual records of insured lives (see
Greville, loc. cit., “Census Methods of Constructing Mortality Tables
and Their Relation to Insurance Methods,” and H. H. Wolfenden,
T.A.S.A., XLIII, 258, “On the Formulae for Calculating the ‘Exposed to
Risk’ in Constructing Mortality and Other Tables from the Individual
Records of Insured Lives,” in which formula (31) is the basis of that
employed by Greville).



VII 


THE CONSTRUCTION OF MORTALITY TABLES
FROM POPULATION STATISTICS


81. With the formulae of the preceding section at hand, the methods
of constructing mortality tables from population statistics may now be
considered in detail. They fall into three divisions, according as the
statistics employed are (1) Death Returns only; (2) Census Returns only; or
(3) Death and Census Returns supplemented frequently by Birth Returns at
the Infantile Ages.

(1) Construction of Mortality Tables from Death Returns Only

82. In the case of the hypothetical stationary community of the life
table, a mortality table could be formed by recording the deaths, $d_x$, in
each year of age, summing them to obtain $l_x$, and thence computing
$q_x = d_x/l_x$. It is clear, however, that this principle will be disturbed
if the number of annual births varies, or if there is any immigration or
emigration; and since varying birth rates and migrations are found to
exist in all actual communities, it would be necessary to introduce
corrections* for those variations in order for the principle to be applicable
in practice. Such corrections, however, cannot be determined accurately,
because the variations in birth rates and migrations are themselves
interwoven, and are of a fluctuating character, while the migration statistics
are usually defective.
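
For the stationary case only, the principle just described (sum the deaths from the oldest age downward to obtain the living, then divide) may be sketched in Python. The function name and the deaths used are hypothetical, and the sketch makes no attempt at the corrections for varying births or migration which the text says would be needed in any actual community.

```python
def life_table_from_deaths(deaths_by_age):
    """Build l_x and q_x from deaths alone, assuming a stationary,
    closed community.

    deaths_by_age[x] is d_x, the deaths between ages x and x + 1.
    l_x is the sum of the deaths at age x and all higher ages, and
    q_x = d_x / l_x.
    """
    survivors = []
    remaining = sum(deaths_by_age)     # l_0, the radix implied by the deaths
    for d in deaths_by_age:
        survivors.append(remaining)
        remaining -= d
    rates = [d / l if l else float("nan")
             for d, l in zip(deaths_by_age, survivors)]
    return survivors, rates


# Hypothetical deaths for a three-age table.
l_x, q_x = life_table_from_deaths([10, 5, 5])
print(l_x)
print(q_x)
```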

Two historically important examples of the application of this method
are Halley's Breslau Table, which was constructed in 1693 from the deaths
in the city of Breslau, Germany, in the years 1687-91 (see J.I.A., I, 42,
and XVIII, 251, and Henderson's “Mortality Laws and Statistics,” p. 2),
and Dr. Richard Price's Northampton Tables which were published in
1771 and 1783 from the deaths in the Parish of All Saints, Northampton,
England (see J.I.A., XVIII, 107). Both these early writers were
cognizant of the necessity for the population being stationary, in order for

* The nature of the requisite correction in the case of a population with an increasing
number of annual births is shown on p. 6 of George King's Institute of Actuaries'
Text Book, Part II (which may still be consulted for the clearness of its presentation)
and the possible effect of migration upon the principle of summing the d column to get
the number of living, l, is illustrated by an example in Spurgeon's Life Contingencies
(with which the actuarial student may be assumed to have become familiar in his early
reading).



the method to be strictly applicable. Halley remarked that “the method
requires, if it were possible, that the People we treat of should not at all
be changed, but die where they were born, without any Adventitious
Increase from Abroad, or Decay by Migration elsewhere,” and that this
condition “seems in a great measure to be satisfied by the late curious
Tables of the Bills of Mortality at the City of Breslaw”; while he further
commented that “in the Five years mentioned . . . there were born 6,193
Persons, and buried 5,869, that is born per annum 1,238, and buried
1,174, whence an Encrease of the People may be argued of 64 per annum
. . . ; but this being contingent, and the Births certain, I will suppose
the People of Breslaw to be encreased by 1,238 Births annually” (loc. cit.).
Dr. Price, on the other hand, had to deal with an excess (469) of burials
over christenings, for which he corrected by assuming an immigration
into Northampton at age 20. Although both the Breslau and Northampton
tables were defective by reason of the arbitrary assumptions which
were made in overcoming the discrepancies between the births and total
deaths, they are important as being the first complete mortality tables
published, and as examples of the difficulties encountered in attempting
to construct such tables in the absence of census returns.*

* Arne Fisher, in a paper in Proc. Casualty Actuarial Society, Vol. IV, and a book
entitled “An Elementary Treatise on Frequency Curves, and Their Application in the
Analysis of Death Curves and Life Tables,” has claimed that mortality tables can be
constructed from deaths alone by (i) classifying the deaths, by age groups, according
to certain groups of causes of death; (ii) calculating the proportionate death ratios
(i.e., the proportion of deaths for each cause-group to the total deaths from all causes,
for each age group) which depend on the deaths alone and are absolutely independent
of the numbers exposed to risk; (iii) expressing these ratios as Charlier frequency
functions, the groups in (i) being, in fact, so chosen that these ratios conform to
Charlier curves “whose parameters are known or chosen beforehand”; and then (iv)
assuming that we can pass from the fluctuating actual community to the stationary
hypothetical community of the life table by imposing the condition (which subsists
in the life table) that the total deaths from all causes at all ages shall equal the radix
of the mortality table, the total deaths from all causes at each age (namely $d_x$) which
correspond to that radix being determined by least squares from a series of observation
equations which arise from the frequency curves of (iii) and the assumption (iv). The
very fact, however, that the ratios in (ii) are independent of the numbers at risk
necessarily renders any such method unsafe; for it is easily conceivable that in two
communities, A and B, the actual rates of mortality in B might, for example, be k times
those in A and yet their proportionate death ratios might be identical, in which case
the relation $q_x^B = k\,q_x^A$ ought to emerge, whereas Fisher's method of proceeding from
the identical ratios as the basic data would produce exactly the same $q_x$ for A and B
(except so far as either or both $q_x^A$ or $q_x^B$ found by his method might be altered by the
uncertain process of grouping differently the various causes of death) (see also par.
146 here, and the reviews in J.I.A., LIV, 206, and J.A.S.A., XIX, 114). [The entirely
different principle illustrated by Prof. Karl Pearson in his “Chances of Death” must



For the purposes of this Study it is not necessary to describe all the
early attempts to construct mortality tables because in most cases they
merely provided technically deficient estimates based on unsatisfactory
material, and they are now quite obsolete. It may be of interest, however,
to record the following: (1) Tables of $e_x$ were estimated by the Roman
jurisconsult Macer, and by Ulpian, about the 3rd century, and Ulpian’s
table was used in Italy until the end of the 18th century (see J.I.A., VI,
313, and XXXIV, 159, and H. H. Wolfenden’s review in T.A.S.A.,
XXVII, 470, of C. F. Trenerry’s “The Origin and Early History of
Insurance”). An examination of Ulpian’s table led M. Greenwood to
conclude (in J.R.S.S., CIII, 216) that Ulpian had simply interpolated
between values which had been taken arbitrarily for the expectation of
life at ages below 30 and at 60. (2) Graunt’s classic work in 1661 on the
London Bills of Mortality (see par. 17 here) may be considered to have
foreshadowed the modern mortality table; Graunt clearly seems to have
realized that he could form a life table by a summation of deaths, and,
as Greenwood remarked in J.R.S.S., XCVI, 79, “that was Graunt’s
discovery, and the only contemporary who realized its immense
importance was . . . Halley.” (3) In the United States Barton published a
fragmentary table based on the mortality of part of Philadelphia in 1782
and 1788-90; and Wigglesworth’s Massachusetts Table, constructed in
1793 from deaths alone, was used as an authority in Massachusetts courts
for many years (see J.A.S.A., II, 658, and “Length of Life: A Study of the
Life Table” by L. I. Dublin and A. J. Lotka, reviewed by H. H. Wolfenden
in T.A.S.A., XXVII, 240, for additional details of these and other
early tables of doubtful validity).

(2) Construction of Mortality Tables from Census Returns Only
83. It is clear that if the average age of a population group $P_x^{z}$ be
assumed to be $x + \tfrac{1}{2}$, and that of the corresponding group $P_{x+n}^{z+n}$ at a census
n years later be $x + n + \tfrac{1}{2}$, then if there has been no disturbance from
migration, and if the birth and death rates have been uniform, the ratio
$P_{x+n}^{z+n}/P_x^{z}$ may be taken as ${}_{n}p_{x+\frac{1}{2}}$ (see also footnote, par. 69). If, however, the
group has been increased by a number of immigrants $I$, and decreased by
emigration amounting to $E$, during the n years, and if $D'$ deaths among

not be confused with Fisher’s method. Pearson simply showed the possibility of
splitting the death curve of the life table itself into a series of superimposed
frequency curves, and no suggestion was made of reversing the process, either directly
or by the use of ratios as in Fisher’s method. A brief summary of Pearson’s analysis
in the case of English Life Table No. 4 is given in “The Fundamental Principles of
Mathematical Statistics,” p. 315].




the net migrants $I - E$ have occurred so that the surviving migrants at
the second census are $I - E - D'$, then ${}_{n}p_{x+\frac{1}{2}}$ will be obtainable from the
expression

$$\frac{P_{x+n}^{z+n} - (I - E - D')}{P_x^{z}};$$

and if, as is usually the case in practice, the enumerated populations have
arisen from an irregular series of annual births, it will also be necessary to
estimate their rates of increase in order to combine them with the above
expression and obtain ${}_{n}p_{x+\frac{1}{2}}$ from a population which may be assumed
to be approximately stationary.
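
As a rough numerical sketch of the migration adjustment in par. 83 (it is not part of the original text), the surviving net migrants can be removed from the second census count before dividing by the first. The function name, argument names, and figures below are hypothetical, and the further adjustment for irregular births is not attempted.

```python
def census_survival_ratio(pop_first, pop_second, immigrants=0.0,
                          emigrants=0.0, migrant_deaths=0.0):
    """Estimate the n-year survival ratio of a census group.

    pop_first      -- group enumerated at the first census
    pop_second     -- corresponding group (n years older) at the second census
    immigrants     -- I, immigrants joining the group during the n years
    emigrants      -- E, emigrants leaving the group during the n years
    migrant_deaths -- D', deaths among the net migrants (usually estimated)
    """
    if pop_first <= 0:
        raise ValueError("first-census population must be positive")
    # Migrants still present at the second census: I - E - D'
    surviving_net_migrants = immigrants - emigrants - migrant_deaths
    return (pop_second - surviving_net_migrants) / pop_first


# Illustrative figures only.
no_migration = census_survival_ratio(100_000, 93_000)
with_migration = census_survival_ratio(100_000, 93_000, immigrants=4_000,
                                       emigrants=1_500, migrant_deaths=100)
print(no_migration, with_migration)
```

With net immigration the unadjusted ratio overstates survival, which is why the adjusted figure in this example is the lower of the two.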

It is, however, very difficult to make these adjustments. The statistics
of migration are frequently incomplete, and are seldom available according
to age; the deaths $D'$ among them, although comparatively small,
would usually have to be estimated; and the rates of increase of the
population groups are not easy to determine, and generally are not uniform.
A detailed example of a correction of this type for migration was given by
H. G. W. Meikle in his report (noted in par. 34(b) here) on the Indian
census returns of 1921, although under the particular circumstances of
the Indian populations at that time he concluded that the effect of
migration upon the age distribution was unimportant, and that it was even
less significant with regard to the rates of mortality which eventually
emerged.

84. Notwithstanding these difficulties, the construction of approximate
mortality tables from census enumerations only* (without any tabulations
of the deaths) was undertaken of necessity in the United States
prior to the establishment of death registration, Levi W. Meech having
computed and published such tables in the second edition of his “System
and Tables of Life Insurance.” In India also the method has been
employed extensively for the same reason.

The principles have been illustrated in the case of India by G. F. Hardy
in three reports epitomized in J.I.A., XXV, 217, by T. G. Ackland in
J.I.A., XLVII, 315, and by H. G. W. Meikle and L. S. Vaidyanathan in
their reports on the censuses of 1921 and 1931 already noted in par. 34(b).
Complete details of the various processes employed are given in those

* As noted in the footnote to par. 17 here, the United States census schedules from
1850 to 1900 included questions (though with unsatisfactory results) with respect to
deaths in the year preceding the census. In the report on the census of 1860 (see also
J.I.A., XIII, 289) Dr. Edward Jarvis consequently gave (p. 524) a life table for whites
based on that census and the deaths so reported to the enumerators; but although his
table thus was based on census data only, the method of construction clearly utilized
death statistics and therefore did not follow the principles under discussion here.




publications. In Ackland’s investigation, for example, which related to
the decennium 1901-11, a preliminary correction for errors of age (see par.
52 here) was first used. Having thus obtained a more reliable age
distribution for both the 1901 and 1911 censuses, per 100,000 of each sex, the
arithmetic mean was taken as the mean population, by age groups, for
the period 1901-11, corrections being introduced in the cases of Madras
and the United Provinces to allow for the effects of emigration upon the
mean population figures thus ascertained. These mean populations were
then graduated, the values thence derived at each age giving a graduated
mean population at each age for the period 1901-11. In order now to
deduce the rates of mortality, the corrected age group figures for 1901 and
1911 were compared and the rate of increase of the population over the
decennium was obtained for each age group, the rates so found being
graduated to give $r_x$, the graduated rate of decennial increase at age x;
and since the graduated mean population for 1901-11 as previously found
represents approximately the population of correct age distribution at the
middle point of the decennium, it was then possible by multiplying and
dividing by $r_x^{1/2}$ to obtain the populations at the same age in 1911 and 1901
respectively, from which ${}_{10}p_{x+\frac{1}{2}}$ follows directly, and thence $p_x$ by
interpolation. A simpler alternative process, which was used by Ackland for
some of the Provinces, and was adopted in the subsequent reports of
Meikle and Vaidyanathan, is to multiply and divide by $r_x^{1/20}$ instead of $r_x^{1/2}$,
thus obtaining the estimated population at each age six months after and
six months before the middle point of the decennium, from which $p_{x+\frac{1}{2}}$ is
obtained directly and thence $p_x$ by interpolation.

Such methods, however, are not often required under modern
conditions; and in practice they can give only approximate results.
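
The process described for Ackland's investigation can be sketched as follows. This is an illustration only: it assumes that the decennial rate of increase is a ratio applied with exponent 1/2 (ten-year interval) or 1/20 (one-year interval), which is a reading of the garbled symbols in the source, and the function name and data are hypothetical.

```python
def survival_from_mean_population(mean_pop, growth, years):
    """Survival ratios from a graduated mid-decennium population.

    mean_pop -- dict: age x -> graduated mean population at the middle
                of the decennium
    growth   -- dict: age x -> graduated decennial growth ratio r_x
    years    -- 10 for the ten-year ratio, 1 for the simpler one-year form
    """
    # Half the interval, expressed as a fraction of the decennium:
    # 10 years -> exponent 1/2, 1 year -> exponent 1/20.
    shift = years / 20.0
    result = {}
    for x in sorted(mean_pop):
        older = x + years
        if older not in mean_pop:
            continue
        earlier = mean_pop[x] / growth[x] ** shift           # before mid-point
        later = mean_pop[older] * growth[older] ** shift     # after mid-point
        result[x] = later / earlier
    return result


# Illustrative figures only.
ten_year = survival_from_mean_population(
    {20: 1000.0, 30: 900.0}, {20: 1.21, 30: 1.0}, years=10)
print(ten_year)
```

The ratios returned relate to groups of average age x + 1/2, so an interpolation step (not shown) would still be needed to reach values at exact ages.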

(3) Construction of Mortality Tables from Death and Census
Returns, Supplemented Frequently by Birth Returns
at the Infantile Ages

85. The more usual method of construction, which is adopted whenever
the data permit, is to employ both the registered deaths and the census
returns in order to obtain $m_x$ (or sometimes $q_x$) directly from the
observations, without the necessity of introducing doubtful assumptions as in the
preceding methods which use either death or census returns only. As
censuses are usually taken at decennial intervals, many tables have been
constructed in the past on the basis of the mean populations living
throughout the intercensal period and the corresponding ten years’
deaths, in order to avoid the danger of reflecting unduly the fluctuations
in mortality which may occur in particular calendar years. With improving
mortality, however, such tables will give rates of mortality higher
than those prevailing in the last years of the decennium. When the data
are of sufficient extent, it is therefore preferable to base the investigation
upon the results of one census only (with the incidental advantage of
thereby avoiding the calculation of the mean population over a ten-year
period) and the deaths for, say, two or three adjacent years, so long as
those years have not been abnormal by reason of wars, epidemics, or
excessive migration.
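
A minimal sketch of the preferred basis in par. 85 follows: one census population combined with the deaths of a few adjacent years. It assumes the census population may stand for the average population of those years; the conversion from $m_x$ to $q_x$ is the familiar approximation $q = 2m/(2+m)$, which is not stated in this paragraph. All names and figures are hypothetical.

```python
def central_death_rates(census_pop, deaths_by_year):
    """Central death rates m_x from one census and several years' deaths.

    census_pop     -- dict: age -> population enumerated at the census
    deaths_by_year -- list of dicts (one per calendar year): age -> deaths
    """
    n_years = len(deaths_by_year)
    rates = {}
    for age, population in census_pop.items():
        total_deaths = sum(year[age] for year in deaths_by_year)
        # Deaths of n years are set against n times the census population.
        rates[age] = total_deaths / (n_years * population)
    return rates


def q_from_m(m):
    """Approximate probability of death from a central rate (assumption:
    deaths uniformly spread over the year of age)."""
    return 2.0 * m / (2.0 + m)


# Illustrative figures only: three adjacent years around a census.
m = central_death_rates({40: 10_000}, [{40: 50}, {40: 55}, {40: 45}])
print(m[40], q_from_m(m[40]))
```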

In now considering the details of the various methods of construction
which fall under this heading (3), it is desirable to take up first (A) The
Infantile Ages, then (B) The Adult Ages, and finally certain supplementary
methods for (C) The Oldest Ages, and the “Juvenile” Ages (i.e.,
those between the Infantile and Adult Ages). The range of ages included
by the term “infantile” will, as in par. 78, be taken as the five ages from
0 to 4, last birthday, inclusive.

(A) INFANTILE AGES

86. On account of the unreliability of the numbers enumerated by the
census at infantile ages, as explained in pars. 29-32, it has been customary,
in the long series of English Life Tables published by the Registrars-General
of England and Wales, to discard the census statistics and to
deduce the rates of mortality from the more reliable registrations of
births and deaths. In the U.S. and Canada, however, birth registrations
also are incomplete (see par. 46); and it is then necessary to correct or
recalculate the births or populations in order to apply methods as described
in par. 92 hereafter.


THE ENGLISH LIFE TABLE METHODS

87. The basic principle of the English Life methods is to accept the
birth and death registrations as correct and consistent (see par. 24), and
to calculate $q_x$ directly therefrom, migrations being ignored as
unimportant at these ages. The constructions in Tables 1 to 8 were based on
the assumption that formula (18) would give sufficiently accurate results
(cf. T.A.S.A., XXIV, 151 and 154), allowance for the varying births and
deaths of different calendar years being made implicitly by using the data
of a number of calendar years. The various ways in which such data may
be arranged can be seen from the following classifications (taken from
H. H. Wolfenden’s paper on “The Determination of the Rates of
Mortality at Infantile Ages, from Statistics of the General Population,”
T.A.S.A., XXIV, 127-31):

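If $b_n$ denotes the births in the nth calendar year and ${}_{n}d_x$ the deaths aged
x last birthday in the nth calendar year, then on these assumptions

$$
\begin{array}{lcccccc}
\text{The Births} & & \multicolumn{5}{c}{\text{The Corresponding Deaths}} \\
\tfrac{1}{2}(b_1+b_2) & & {}_{2}d_0 & {}_{3}d_1 & {}_{4}d_2 & {}_{5}d_3 & {}_{6}d_4 \\
\tfrac{1}{2}(b_2+b_3) & & {}_{3}d_0 & {}_{4}d_1 & {}_{5}d_2 & {}_{6}d_3 & {}_{7}d_4 \\
\tfrac{1}{2}(b_3+b_4) & \text{produce} & {}_{4}d_0 & {}_{5}d_1 & {}_{6}d_2 & {}_{7}d_3 & {}_{8}d_4 \\
\tfrac{1}{2}(b_4+b_5) & & {}_{5}d_0 & {}_{6}d_1 & {}_{7}d_2 & {}_{8}d_3 & {}_{9}d_4 \\
\tfrac{1}{2}(b_5+b_6) & & {}_{6}d_0 & {}_{7}d_1 & {}_{8}d_2 & {}_{9}d_3 & {}_{10}d_4
\end{array}
$$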

The rates of mortality for the first five years of age may now be
determined from such data in a number of ways, which may be classified as
follows:

88. Farr's Method. (i) The principle employed by Dr. Farr in the
English Life Table No. 1, in the Healthy English Life Table No. 1 (see
J.I.A., IX, l.H, and XLII, 229), and in the English Life Table No. 3 was
to proceed, in the above scheme, directly along the line, thus:

$$q_0 = \frac{{}_{2}d_0}{\tfrac{1}{2}(b_1+b_2)}, \qquad
q_1 = \frac{{}_{3}d_1}{\tfrac{1}{2}(b_1+b_2) - {}_{2}d_0}, \qquad
q_2 = \frac{{}_{4}d_2}{\tfrac{1}{2}(b_1+b_2) - {}_{2}d_0 - {}_{3}d_1},$$

$$q_3 = \frac{{}_{5}d_3}{\tfrac{1}{2}(b_1+b_2) - {}_{2}d_0 - {}_{3}d_1 - {}_{4}d_2}, \qquad
q_4 = \frac{{}_{6}d_4}{\tfrac{1}{2}(b_1+b_2) - {}_{2}d_0 - {}_{3}d_1 - {}_{4}d_2 - {}_{5}d_3}.$$

The several values of q were thus determined from the same group of
births, by employing successively the deaths of five different calendar
years which appear along the line in the above scheme. In order to obtain
a wider basis than is given by the data of only one line, several lines may
of course be combined as in the English Life No. 3; or, as in the Healthy
English Table, the average of the results of several lines may be employed.

(ii) Instead of finding $q_0$, $q_1$, etc., directly, as in (i), by involving the
deaths in the denominators, the values of $q_0$, ${}_{1|}q_0$, ${}_{2|}q_0$, etc., could be found
as follows, the values of q being easily deducible therefrom:

$$q_0 = \frac{{}_{2}d_0}{\tfrac{1}{2}(b_1+b_2)}, \qquad
{}_{1|}q_0 = \frac{{}_{3}d_1}{\tfrac{1}{2}(b_1+b_2)}, \qquad
{}_{2|}q_0 = \frac{{}_{4}d_2}{\tfrac{1}{2}(b_1+b_2)}, \ \ldots;$$

and a wider basis could similarly be obtained by combining the data of
several lines or by averaging the various results.
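
The two forms of Farr's method can be sketched as below. This is an illustration under the scheme of par. 87, with hypothetical function names and figures; it takes the mean births of one line and the deaths read along that line (ages 0, 1, 2, ...).

```python
def farr_direct(mean_births, deaths_along_line):
    """Form (i): each q divides the deaths by the survivors of the same births."""
    survivors = float(mean_births)
    rates = []
    for deaths in deaths_along_line:
        rates.append(deaths / survivors)
        survivors -= deaths  # remove these deaths before the next age
    return rates


def farr_deferred(mean_births, deaths_along_line):
    """Form (ii): deferred probabilities, every denominator being the births."""
    return [deaths / mean_births for deaths in deaths_along_line]


def rates_from_deferred(deferred):
    """Deduce q at each age from the deferred probabilities."""
    surviving = 1.0
    rates = []
    for dq in deferred:
        rates.append(dq / surviving)
        surviving -= dq
    return rates


# Illustrative figures only: b1 = b2 = 1000, deaths at ages 0, 1, 2.
births = 0.5 * (1000 + 1000)
line = [100.0, 45.0, 19.0]
print(farr_direct(births, line))
print(rates_from_deferred(farr_deferred(births, line)))
```

Both routes give the same rates, which is the sense in which the values of q are "easily deducible" from the deferred probabilities.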

89. The Methods of English Life Tables Nos. 5 to 8. Farr’s method, as
already pointed out, employs the deaths of successive years, and
consequently it may be objected that it will not produce the rates of mortality
of any particular period. Another method of selecting the data was
therefore employed in several later tables, by which the deaths are taken along
the diagonal in the scheme of par. 87 (instead of along the line) so that for
each age they are supplied by the same calendar year.




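(i) When q is calculated directly by this means the formulae may
therefore be written down at once as

$$q_0 = \frac{{}_{6}d_0}{\tfrac{1}{2}(b_5+b_6)}, \qquad
q_1 = \frac{{}_{6}d_1}{\tfrac{1}{2}(b_4+b_5) - {}_{5}d_0}, \qquad
q_2 = \frac{{}_{6}d_2}{\tfrac{1}{2}(b_3+b_4) - {}_{4}d_0 - {}_{5}d_1},$$

$$q_3 = \frac{{}_{6}d_3}{\tfrac{1}{2}(b_2+b_3) - {}_{3}d_0 - {}_{4}d_1 - {}_{5}d_2}, \qquad
q_4 = \frac{{}_{6}d_4}{\tfrac{1}{2}(b_1+b_2) - {}_{2}d_0 - {}_{3}d_1 - {}_{4}d_2 - {}_{5}d_3}.$$

The principle of this method, extended to include a number of calendar
years’ deaths, was used in the London Life Table, and the English Life
Tables Nos. 5, 6, 7, and 8 (see, for example, Supp. to 75th Report of the
Registrar-General, Part I, p. 5).*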

(ii) This same method of proceeding along the diagonal may again be
applied to the calculation of $q_0$, ${}_{1|}q_0$, etc., from which the values of q would
be deduced, the formulae being:

$$q_0 = \frac{{}_{6}d_0}{\tfrac{1}{2}(b_5+b_6)}, \qquad
{}_{1|}q_0 = \frac{{}_{6}d_1}{\tfrac{1}{2}(b_4+b_5)}, \qquad \ldots, \qquad
{}_{4|}q_0 = \frac{{}_{6}d_4}{\tfrac{1}{2}(b_1+b_2)}.$$

This is the principle used by Moors and Day, J.I.A., XXXVI, 167, and
also by C. H. Wickens, XLIII, 74.
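
The diagonal selection of par. 89 can be sketched as follows. The data layout (a dict of births by calendar year and a dict of deaths keyed by calendar year and age) and the function names are hypothetical; the arithmetic follows the scheme of par. 87, in which deaths aged x in year n arise from the mean births of years n - x - 1 and n - x.

```python
def diagonal_rates(births, deaths, year=6, ages=5):
    """Form (i): q at each age from the deaths of a single calendar year.

    births -- dict: calendar year -> births
    deaths -- dict: (calendar year, age last birthday) -> deaths
    """
    rates = []
    for x in range(ages):
        first = year - 1 - x  # earlier of the two birth years for age x
        exposed = 0.5 * (births[first] + births[first + 1])
        # Remove the deaths of the same births at the younger ages.
        for j in range(x):
            exposed -= deaths[(first + 1 + j, j)]
        rates.append(deaths[(year, x)] / exposed)
    return rates


def diagonal_deferred(births, deaths, year=6, ages=5):
    """Form (ii): deferred probabilities along the diagonal."""
    result = []
    for x in range(ages):
        first = year - 1 - x
        mean_births = 0.5 * (births[first] + births[first + 1])
        result.append(deaths[(year, x)] / mean_births)
    return result


# Illustrative stationary data: equal births and deaths in every year.
b = {n: 1000.0 for n in range(1, 7)}
per_age = [100.0, 45.0, 19.0, 8.0, 4.0]
d = {(n, x): per_age[x] for n in range(2, 7) for x in range(5)}
print(diagonal_rates(b, d))
```

With stationary data of this kind the diagonal and the line give identical rates; the two selections differ only when births or deaths vary from year to year.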

90. Pell's Method. The method of Dr. Farr, which proceeds along the
line in the scheme of par. 87 and so requires the extraction of the births of
two years and the deaths of five calendar years, and the method just
stated in par. 89, which proceeds along the diagonal and so employs the
births of six years and the deaths of one year, may clearly be combined
and extended without the necessity of examining the data of any further
calendar years. The data so combined will be included within the triangle
formed by the top line, the diagonal, and the left side; and the formulae,


* On the supposition that the total of the census populations under 5 was correct
though erroneously distributed, an adjustment was used in the London Life Table and
English Life Tables Nos. 5 and 6 by which the total “numbers living” under 5, as
computed from the birth and death statistics on the principles of par. 89(i), were
brought to coincide with the numbers enumerated by the census (see the official
volumes of those tables; Newsholme’s Vital Statistics (3rd Edition), p. 274; T. E.
Hayward, J.R.S.S., XLII, 451; J. Buchanan, Proceedings 6th International Congress of
Actuaries, II, 610; and G. King, Supplement 75th Registrar-General’s Report, I, 5-13).
In English Life Table No. 8 and later tables, however, this adjustment was discarded
because King showed clearly that it reduced the “numbers living” erroneously, and so
produced an overstatement of mortality, since the census populations (instead of being
inaccurately distributed) were actually deficient at ages 0 and 1 (see also H. H.
Wolfenden, T.A.S.A., XXIV, 131-34, and pars. 29-31 here).




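which are most compactly stated for the deferred probabilities, are clearly

$$q_0 = \frac{{}_{2}d_0 + {}_{3}d_0 + {}_{4}d_0 + {}_{5}d_0 + {}_{6}d_0}{\tfrac{1}{2}(b_1+b_2) + \tfrac{1}{2}(b_2+b_3) + \cdots + \tfrac{1}{2}(b_5+b_6)},$$

$${}_{1|}q_0 = \frac{{}_{3}d_1 + \cdots + {}_{6}d_1}{\tfrac{1}{2}(b_1+b_2) + \cdots + \tfrac{1}{2}(b_4+b_5)},$$

$${}_{2|}q_0 = \frac{{}_{4}d_2 + {}_{5}d_2 + {}_{6}d_2}{\tfrac{1}{2}(b_1+b_2) + \cdots + \tfrac{1}{2}(b_3+b_4)},$$

$${}_{3|}q_0 = \frac{{}_{5}d_3 + {}_{6}d_3}{\tfrac{1}{2}(b_1+b_2) + \tfrac{1}{2}(b_2+b_3)},$$

and

$${}_{4|}q_0 = \frac{{}_{6}d_4}{\tfrac{1}{2}(b_1+b_2)}.$$

These are the formulae of Professor Pell's method, XXI, 264 (see
also T.A.S.A., XXIV, 130).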
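
Pell's combination of the line and the diagonal amounts to pooling, for each age, every entry of the triangle described in par. 90. A sketch under that reading follows, using the same hypothetical data layout of births by calendar year and deaths keyed by calendar year and age.

```python
def pell_deferred(births, deaths, first_year=1, ages=5):
    """Deferred probabilities pooled over the triangle of the scheme.

    births -- dict: calendar year -> births (years first_year .. first_year+ages)
    deaths -- dict: (calendar year, age last birthday) -> deaths
    """
    result = []
    for x in range(ages):
        # Lines of the scheme that still lie inside the triangle at age x.
        lines = range(first_year, first_year + ages - x)
        pooled_deaths = sum(deaths[(i + 1 + x, x)] for i in lines)
        pooled_births = sum(0.5 * (births[i] + births[i + 1]) for i in lines)
        result.append(pooled_deaths / pooled_births)
    return result


# Illustrative stationary data, as in a life table.
b = {n: 1000.0 for n in range(1, 7)}
per_age = [100.0, 45.0, 19.0, 8.0, 4.0]
d = {(n, x): per_age[x] for n in range(2, 7) for x in range(5)}
print(pell_deferred(b, d))
```

Five lines contribute at age 0 and only one at age 4, so the youngest ages, where mortality changes fastest, rest on the widest basis.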

91. The Methods of English Life Tables Nos. 9 and 10. All the
procedures outlined in pars. 87-90, as used in English Life Tables 1 to 8,
employ the basis of formula (18) with its assumption of uniform movement
over the year of age. In constructing English Life Table No. 9 based on
the deaths of 1920-22, and the Northern Ireland Life Tables, 1926 (for
the references see par. 120 here), the violent fluctuations in the numbers
of births during and after the war of 1914-18 necessitated the
abandonment of the uniformity assumption of formula (18); and as returns of
births were available for each quarter of each calendar year, it was
assumed that the births in each quarter were distributed uniformly so that,
on the principle of par. 89(i) suitably modified, the formulae were


[The formulae are illegible in this copy. They are written in terms of the
quarterly births, one symbol representing the births in the nth quarter
of 1919 and another the births during the whole of 1919.]





For English Life Table No. 10 based on the deaths of 1930-32, and the
corresponding table for Scotland (see par. 120 here), the preceding method
was again adopted at ages 1 to 5; at age 0, however, a further elaboration
was introduced by using quarterly deaths as well as quarterly births, the
probabilities of death being determined for each quarter of the year of age
and then summed to give $q_0$. Denoting by $D_{0/3}$, $D_{3/6}$, ..., the deaths in the
first, second, ..., quarters of the year of age, and by $q_{0/3}$, $q_{3/6}$, ..., the
corresponding probabilities of death, the formulae were

$$q_{3/6} = \frac{D_{3/6}^{1930} + D_{3/6}^{1931} + D_{3/6}^{1932}}{[\ldots]}$$

and so on, the denominator (formed from the quarterly births) being
illegible in this copy.*


THE METHODS DEVELOPED IN THE UNITED STATES
92. In the United States and Canada (as pointed out in pars. 45-47)
birth registrations have always been incomplete, and death registrations
* It is sometimes desired to compute values of $q_0$ from the experience of a period
shorter than a year, such as a quarter or a month. Quarterly infant mortality rates
accordingly have been published by the Registrar-General of England and Wales, and
monthly rates are published by the National Office of Vital Statistics in the United
States. The Registrar-General's method (which is also employed by the National Office
of Vital Statistics) for computing these rates consists in allocating the infant deaths in a
particular month or quarter, subdivided by age at death (under 1 day, 1 day, 2 days, 3-6
days, 7-13 days, 14-20 days, 21 days to 1 month, and thereafter for each month) to the
month or quarter of birth by means of factors (varying slightly according to the
number of days in the month); the infant deaths so allocated to each month or quarter are
then divided by the number of births occurring in that month or quarter; and the ratios
are summed to represent the monthly or quarterly rates (see the 83rd Annual Report
of the Registrar-General for England and Wales, and the U.S. Vital Statistics report
by Moriyama and Greville noted in par. 79 here). This method, however, is somewhat
laborious; an abridgment has therefore been used by DePorte for computing $q_0$ on a
monthly basis. Instead of allocating infant deaths to the precise month of birth,
DePorte’s method in effect allocates deaths of infants under 1 month and over 1 month
of age to the approximate or average month of birth; the deaths under 1 month of age
occurring during a particular month are divided by the number of births during the
month, and the deaths of infants surviving the first month but dying before reaching
their first birthdays are divided by the monthly average number of births for the 11-month
period preceding the month of death; and the two ratios are then added (see J. V.
DePorte, “Rate of Infant Mortality Adjusted to a Rapidly Changing Birth Rate,”
Health News, New York State Department of Health, XXI [1944], and discussion
thereof in the paper by Moriyama and Greville where DePorte’s abridged method is shown
to be sufficiently accurate for most practical purposes).




may not be wholly reliable in all areas. The enumerated populations,
moreover, as in other countries, are deficient at ages 0 and 1 (see pars.
29-32). Under these circumstances the methods of pars. 87-91 are not
appropriate, and special procedures have been devised on the basis of the
formulae stated here in Section VI. The problem, in general terms, is to
compute $q_x$ at each age from 0 to 4 from some arrangement of (a) the
deficient births; (b) the deaths, $D_x^z$, which often are assumed to be
correct, although in certain areas deficiencies may exist; and (c) the census
populations, which are deficient at ages 0 and 1, but generally are
reliable for ages 2, 3, and 4 in total.

(i) In the U.S. Life Tables 1890, 1901, 1910, and 1901-10, the method
used by J. W. Glover followed that given in Czuber’s
“Wahrscheinlichkeitsrechnung,” and was discussed further by R. Henderson in T.A.S.A.,
XXIII, 435, and H. H. Wolfenden in T.A.S.A., XXIV, 136, and XXV,
148. The registered deaths $D_x^z$ in each year of age, and the enumerated
populations at each age from 2 upwards, were assumed to be correct. The
deaths were then divided into their ${}_{\alpha}D_x^z$ and ${}_{\delta}D_x^z$ components by means of
values of $f_x$ which were found as in par. 65. The births were then
computed directly from the population aged, say, 2 last birthday (or
similarly from that aged 3 or 4), by formula (6), the population being
computed in the U.S. Life Tables on the assumption of its own G.P.
It was then assumed that, although the absolute values of $P_0$ and $P_1$ are
unsound, nevertheless the arithmetical increase $P_x^{z+1} - P_x^z$ for each age
may be taken as correct, the populations at dates other than the census
date being estimated, in the U.S. Life Tables, by their own G.P.'s in each
case. The data then consisted of a computed number of births $E_0^z$, the
original deaths ${}_{\alpha}D_x^z$ and ${}_{\delta}D_x^z$, and computed values of $P_x^{z+1} - P_x^z$, which
may be denoted by $\delta_x^z$, at each age.

Two courses are then open. We may apply equation (1) in the form
$E_{x+1}^z = E_x^z - \delta_x^z - D_x^z$ to build up, from the births $E_0^z$, a column $E_x^z$ (see
U.S. Life Tables, p. 343, and T.A.S.A., XXIV, 137); and from $E_x^z$ so
constructed, with $D_x^z$, and $\delta_x^z$ which is assumed to be correct, $q_x$ may be
computed by a formula which again must involve only E, D, and δ, such as
(29) with the factor in its denominator
taken from known values (see T.A.S.A., XXV, 150), or (39) so long as in
the first year of age at least it is applied by months of age (see par. 80
here). Or, this limitation on the permissible formulae for $q_x$ may be
removed by adding the further simple step of recomputing the separate
values of $P_x^z$ and $P_x^{z+1}$ from $E_x^z$, $\delta_x^z$, and the D's by formulae (2) and (3); for
by so doing a table of $P_x^z$, $P_x^{z+1}$, $\delta_x^z$, $E_x^z$, and the D's will be obtained which
will be consistent throughout, so that any of the formulae for $q_x$ may be
applied to it (see T.A.S.A., XXV, 151).
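
The building-up of the column of those attaining each exact age, and the optional recomputation of the populations, can be sketched as below. The forms taken here for formulae (2) and (3) are assumptions inferred from their life-table analogues in par. 93 (population at the start of the year equals those attaining the next age plus the deaths among the earlier entrants; population at the end of the year equals those attaining the age less the deaths among the same year's entrants). All names and figures are hypothetical.

```python
def build_consistent_table(births, increases, deaths_same_year,
                           deaths_prev_entrants):
    """Return (E, p_start, p_end) for ages 0, 1, 2, ...

    births               -- computed births, i.e. E at age 0
    increases            -- population increase over the year at each age
    deaths_same_year     -- deaths among those attaining the age in the year
    deaths_prev_entrants -- deaths among those who attained it the year before
    """
    attained = [float(births)]
    for delta, d_same, d_prev in zip(increases, deaths_same_year,
                                     deaths_prev_entrants):
        # Equation (1) rearranged: next E = E - increase - total deaths.
        attained.append(attained[-1] - delta - (d_same + d_prev))

    n = len(increases)
    # Assumed form of formula (2): start-of-year population.
    p_start = [attained[x + 1] + deaths_prev_entrants[x] for x in range(n)]
    # Assumed form of formula (3): end-of-year population.
    p_end = [attained[x] - deaths_same_year[x] for x in range(n)]
    return attained, p_start, p_end


# Illustrative figures only.
E, p0, p1 = build_consistent_table(1000, [10, 5], [70, 25], [30, 15])
print(E, p0, p1)
```

Under these assumed forms the recomputed populations reproduce the given increases exactly, which is the consistency the text relies on when it says that any of the formulae may then be applied.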

(ii) In Glover’s procedure the calculation of the births from the
populations aged 2 (or over) by the direct equations (6) necessitates the use of
the deaths of two (or more) calendar years subsequent to the period under
observation; and it was pointed out by R. Henderson (J.A.S.A., XVIII,
552) that this resulted in a small understatement of the original
populations aged 2-4 in the U.S. Life Tables, with a consequent slight
overstatement of the mortality. The assumption that the original increases are
correct is also of doubtful validity; for it implicitly assumes that each of
the populations $P_x^z$ and $P_x^{z+1}$ is affected by an error of the same amount, so
that their difference $\delta_x^z$ is correct notwithstanding considerable
uncertainty as to the nature of those errors (see T.A.S.A., XXIV, 138).
Henderson therefore suggested a method in T.A.S.A., XXIII, 435 (“The
Adjustment of Population Returns at Infantile Ages in the Absence of
Birth Statistics”) in which it was assumed that the populations aged
2, 3, and 4 are correct in total but subject to redistribution, and that all
the populations increase in a G.P. with ratio r (or in an A.P.) as found
from those populations aged 2-4. Hence by an obvious application of
equation (5) a new value of P was determined by a formula having the
denominator $1 + r + r^2$ (its numerator is illegible in this copy),
where $P_x^{z+1} = rP_x^z$ (in the G.P. method); and then, working backwards
therefrom by applying formula (5) at each age, the individual values of P
were established, and finally the births emerged from formula (3).

This method produces a consistent table in which all the fundamental
relations (1)-(4) are maintained, so that $q_x$ may be calculated from it by
any of the available formulae of pars. 67 et seq., preferably by No. (11);
and it has the advantage that the recomputed births and corrected
populations are determined from the data of the observation period alone (see
T.A.S.A., XXIII, 439; XXIV, 150 and 153; and XXV, 151).

(iii) Henderson’s process used the same value of r throughout. In the
construction of the U.S. Abridged Life Tables, 1919-20, which were
prepared by Miss Elbertie Foudray,* it was noted, however, that the birth

* It may also be of interest to note here that other methods devised by Miss Foudray
in the preparation of United States life tables for 1920-29 and 1930-39 are discussed,
with particular reference to corrections at ages over 5 for net migration and
inconsistencies due to age or other errors, by T. N. E. Greville in R.A.I.A., XXXI, 370-73.




rates had varied considerably for several calendar years, and it was
therefore concluded that the assumption of a uniform G.P. or A.P. would not
be permissible. Miss Foudray consequently devised a method under which
the deficient populations were first corrected on the assumption of a
fixed percentage deficit of 9% for whites and 25% for Negroes (see par. 51
here), the births $E_0$ thence following by formula (5). This recalculated $E_0$
for 1919, in comparison with the registered births of 1919, then gave the
percentage by which the registered births of 1919 were assumed to be
deficient; and this percentage deficiency was then assumed to be applicable
for the calculation of the true births from the registered births of previous
calendar years. From these corrected births of previous calendar years and
the registered deaths of those years it was then possible to calculate $P_x^z$
and $P_x^{z+1}$ at each age by subtracting the appropriate deaths from the births
in accordance with formula (6). The column $E_x^z$ then followed by formula
(1). In applying this method Miss Foudray computed only the increases
$\delta_x^z$, or $P_x^{z+1} - P_x^z$, at each age, and was therefore restricted to the use of
formula (29) for $q_x$ which does not involve the actual P's when the factor
in the denominator is taken from known values. As pointed out by
H. H. Wolfenden in T.A.S.A., XXV, 150, however, this restriction is not
necessary; for the separate values of $P_x^z$ and $P_x^{z+1}$ emerge directly by the
above process, and as they satisfy the fundamental relations (1)-(4), $q_x$
may be computed by any formula, preferably by No. (11).

(iv) In preparing the U.S. Life Tables, 1939-41, T. N. E. Greville used
the true Type III formula (11). After due examination (see pars. 46 and
47 here), it was assumed that the degree of under-registration was the
same for births and infant deaths, so that $q_0$ could be derived from the
registered data without adjustment. The deaths were therefore
divided into their ${}_{\alpha}D_x^z$ and ${}_{\delta}D_x^z$ components by using the factors $f_x$
determined as stated in par. 65 here. Then $P_0^z$ follows from the previous year’s
births and deaths by formula (5); $E_1^z$ is obtained from $P_0^z$ and the deaths
by formula (2); $P_0^{z+1}$ from $E_0^z$ and the deaths by formula (5); and these
values with $D_0^z$ give $q_0$ by formula (11).

Before proceeding to the higher ages, the values of $E_1$ so found were
adjusted for under-reporting. The degree of adjustment required was
computed with due regard for the fact that at ages 5 and beyond the
census data and registered deaths were assumed implicitly to be
incomplete by the same percentage, so that at those ages they could be used
without correction. The factors for adjusting $E_1$ were obtained by
estimating $E_1$ from the births between April 1, 1930, and March 31, 1937 by two
independent methods based on evident applications of formulae (2), (5),
and (6), namely: (a) by subtracting the appropriate reported infant
deaths occurring among the reported births of that period, and (b) by
adding to the census populations at each age from 3 to 9 the reported
deaths this group experienced in previous years in passing from age 1 to
the census date. The adjustment factor, formed as the ratio of (a) and
(b), was divided into $E_1$ computed from the recorded births and deaths.
From the value of $E_1$ thus adjusted, corrected values of $E_x$ and $P_x$ for use
in formula (11) at ages 1 to 4 were then derived on the assumption that
the deaths could be taken as correct.
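
The adjustment for under-reporting described above reduces to a ratio of two estimates of the same quantity. A minimal sketch follows, with hypothetical names and figures; it treats each estimate as a single total rather than reproducing the age-by-age detail of the original work.

```python
def under_reporting_factor(reported_births, reported_infant_deaths,
                           census_pop_3_to_9, deaths_since_age_1):
    """Ratio of estimate (a) to estimate (b) of those reaching age 1.

    (a) reported births less the reported infant deaths among them
    (b) census population aged 3 to 9 plus the reported deaths of that
        group between age 1 and the census date
    """
    estimate_a = reported_births - reported_infant_deaths
    estimate_b = census_pop_3_to_9 + deaths_since_age_1
    return estimate_a / estimate_b


def adjust_attained_age_1(e1_recorded, factor):
    """The factor is divided into the value computed from recorded data."""
    return e1_recorded / factor


# Illustrative figures only.
factor = under_reporting_factor(930.0, 50.0, 950.0, 50.0)
print(factor, adjust_attained_age_1(440.0, factor))
```

A factor below 1 indicates that the birth-based estimate falls short of the census-based one, so dividing by it raises the recorded value.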

In comparing these several methods, it will be noted that Glover’s
procedure computes the births from the supposedly correct populations
aged 2 and over and the deaths in calendar years subsequent to the
observation period, and produces an approximate Type III $q_x$; Henderson’s
method calculates the births from the data of the observation period
alone, and gives the correct Type III $q_x$; Miss Foudray’s process requires
the births and deaths of calendar years prior to the observation period,
and finds an approximate Type III $q_x$ (although more correct values can
be obtained as shown in T.A.S.A., XXV, 149-52). Greville’s approach
also uses the births and deaths of calendar years prior to the observation
period, and finds the correct Type III $q_x$.

93. (i) As a result of the nature of the adjustments at infantile ages
described in pars. 87-92 it is generally sufficient to adopt the values of $q_x$
which emerge therefrom without any further graduation. Where
graduation is found to be desirable Makeham’s second formula has been
suggested (see “Investigations Concerning a Law of Infantile Mortality,”
Australian Association for the Advancement of Science, XIV [1913], 526,
and Vol. I, Statistician’s Report, 1911 Census of Australia, p. 325, by
C. H. Wickens, and Greville’s comments in the U.S. Life Tables and
Actuarial Tables, 1939-41, p. 137). Two other modifications of
Makeham’s first formula to provide for the rapid change in values during the
infantile ages have also been proposed by J. F. Steffensen and F. S. Harper
(see “The Fundamental Principles of Mathematical Statistics,” p. 80).

(ii) In proceeding from $q_x$ to the customary life-table functions, $l_x$
and $d_x$ are calculated from an arbitrary radix in the usual manner. The
function $L_x$, however, which at ages 5 and over where $f_x = \tfrac{1}{2}$ is taken
with sufficient accuracy as

$$\frac{l_x + l_{x+1}}{2},$$

must be computed at the infantile ages where $f_x \neq \tfrac{1}{2}$ as $l_{x+1} + f_x d_x$ or
$l_x - (1 - f_x) d_x$ in accordance with formulae (2) and (3) respectively
(see also pars. 65 and 66), these expressions also being equivalent to
$L_x = f_x l_x + (1 - f_x) l_{x+1}$. For $L_0$, moreover, where data are often available
for various subdivisions of the first years of life, a more accurate value can
be obtained by making separate calculations for those subdivisions by the
formula $T_x - T_{x+t} = \tfrac{t}{2}(l_x + l_{x+t})$ and adding the results (as in the U.S.
Life Tables, 1939-41, p. 133).
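
The equivalence of the three expressions for $L_x$ may be checked numerically. The following is a minimal sketch only; the function name, the radix, and the values of $l_x$ and $f_x$ are hypothetical and are not taken from any published table.

```python
def stationary_population(l, f):
    """Return L_x = f_x*l_x + (1 - f_x)*l_{x+1} for each age with a successor.

    l : survivors l_x at exact ages 0, 1, 2, ...
    f : factors f_x for the same ages (hypothetical values).
    """
    return [f[x] * l[x] + (1 - f[x]) * l[x + 1] for x in range(len(l) - 1)]


# Hypothetical radix and survivors, for illustration only.
l = [100000.0, 95000.0, 94000.0]
f = [0.15, 0.5]
L = stationary_population(l, f)

# The two alternative expressions quoted in the text.
d = [l[x] - l[x + 1] for x in range(len(l) - 1)]
L_alt1 = [l[x + 1] + f[x] * d[x] for x in range(len(d))]
L_alt2 = [l[x] - (1 - f[x]) * d[x] for x in range(len(d))]
```

With $f_x = \tfrac{1}{2}$ the same function reduces to the simple mean of $l_x$ and $l_{x+1}$.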

(iii) Another special problem is that of determining values for the
force of mortality, $\mu_x$, at ages 0, 1, and 2, because the usual approximate
formula

$$\mu_x = \frac{8(l_{x-1} - l_{x+1}) - (l_{x-2} - l_{x+2})}{12\,l_x}$$

is inapplicable for $x = 0$ and 1, and at age 2 is unsuitable since it involves
$l_0$. In the U.S. Life Tables, 1939-41, p. 137, $\mu_1$ and $\mu_2$ therefore were found
by Waring’s (Lagrange’s) formula (see footnote, par. 52) from five
unequally spaced values of $l_x$ (including $l_1$, $l_2$, $l_3$, and $l_4$ for $\mu_2$, and
$l_1$, $l_2$, and $l_3$ for $\mu_1$).
In preparing those tables, also, Greville paid particular attention to the
estimation of a realistic value for $\mu_0$ (which has some academic interest
even though its practical utility is slight). After a review of methods
previously used for the calculation of $\mu_0$ in official reports in Australia and
Belgium, and elsewhere in actuarial literature by King and Spurgeon
(all of which seem to have underestimated the value seriously), the
procedure finally adopted was to fit a Gompertz curve to the $l_x$ values at
birth and at the ages of 1 day and 2 days in order to give effect to the
extremely rapid decrease in the death rate immediately after birth.
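
The usual approximate formula for the force of mortality, and the reason it fails at the first ages, can be illustrated by a short sketch. The function name and the test data (a column of $l_x$ with a constant force of mortality of .01) are assumptions made for the illustration only.

```python
import math


def force_of_mortality(l, x):
    """Approximate mu_x = [8(l_{x-1} - l_{x+1}) - (l_{x-2} - l_{x+2})] / (12 l_x).

    The formula needs two values on each side of age x, so it cannot be
    evaluated at x = 0 or x = 1 (and at x = 2 it draws on l_0).
    """
    if x < 2 or x + 2 >= len(l):
        raise ValueError("formula requires l at ages x-2 .. x+2")
    return (8 * (l[x - 1] - l[x + 1]) - (l[x - 2] - l[x + 2])) / (12 * l[x])


# Hypothetical l_x column with constant force of mortality 0.01.
l = [100000.0 * math.exp(-0.01 * age) for age in range(10)]
mu_5 = force_of_mortality(l, 5)
```

On such smooth data the approximation returns the assumed force of mortality very closely; the difficulty at ages 0 to 2 arises from the data required, not from the arithmetic.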

94. In the procedures of pars. 87-91 for the English life tables and
pars. 92(i)-(iv) with respect to those constructed in the United States,
migrations have always been ignored as unimportant. Among the
processes devised for the 1939-41 United States life tables as described in par.
92(v), however, a final adjustment was made for immigration at ages 1 to
4 since the deaths recorded at those ages may include some deaths of
children who entered the country as immigrants. The method which was used
as a sufficiently close approximation was to multiply the computed
mortality rate at each age by the ratio of the native population to the total
population at that age. This ratio is very close to unity, and the effect of
the adjustment was negligible. At age 0 this method is not appropriate,
because the small amount of immigration which occurs is believed to be
heavily concentrated in the latter part of the first year of life, while the
mortality is very much heavier in the early part. The expedient was
therefore adopted of applying the adjustment ratio only to the
probability of death for the second portion of the first year of life. These
adjustments at ages 0 to 4 might result in a slight understatement of the
mortality rates as they assume that no emigration occurred, although this
is partially offset by the fact that the number of deaths subtracted from
the births to obtain the exposed to risk may include some deaths of
immigrant children (see T. N. E. Greville, “U.S. Life Tables and Actuarial
Tables, 1939-41,” pp. 119-20).

(B) ADULT AGES

95. The preceding methods of treating the data at infantile ages are
necessary by reason of the unreliability of the census returns at those
ages. For the later ages (in accordance with the principles of pars. 53 and
54 here, and sometimes after preliminary adjustments for errors of digit
concentration of the types discussed in par. 52) the census returns and
death statistics can usually be taken as being approximately correct in
certain specified quinquennial or decennial age groups. Furthermore, at
ages above 4 last birthday (as stated in pars. 65 and 78), $f_x$ may generally
be taken as $\tfrac{1}{2}$, so that a uniform distribution of deaths may be assumed
and $m_x$ may be found directly by dividing the calendar year’s deaths by
the mean population in accordance with the usual $m_x = d_x/L_x$ of the life
table (and the analogous formulae (27), (33), and (36) hereof for $q_x$). The
problem to be dealt with at these later ages is consequently that of
redistributing the approximately correct quinquennial or decennial
groupings of the deaths and populations into the values at each age, in such a
manner that the irregularities at individual ages will be removed without
disturbing unduly the totals in each group.

The various methods which have been developed in constructing the
best known population tables will therefore now be reviewed in
chronological order. Of the methods to be thus described, the student at the
commencement of his reading should note particularly the following
appraisal of their present comparative utilities: (a) The graphic method
(par. 96) may still be useful under some circumstances. (b) Farr’s
methods (par. 97) are now of historical interest only. (c) The later English Life
Table methods (par. 98) introduced important basic ideas in the “curve
of sines” and the handling of grouped data. (d) The tangential and
osculatory reproducing interpolation formulae on Sprague’s principles (pars.
99-103) and Henderson’s principles (pars. 104-6), and the tangential and
osculatory non-reproducing formulae on Jenkins’ principles (par. 107),
have been used so extensively and have become so well established as
almost routine procedures in many countries because they have produced
sufficiently good results with considerable facility, that their continued
use may still be anticipated in some quarters notwithstanding their
demonstrable and acknowledged weaknesses. (e) The more flexible
interpolation and fitting methods developed by Reid and Dow, Kerrich, and
Greville (pars. 108-11) are important as being essential links in the
theoretical development of the general problem, and as affording
improved practical techniques, although they have not been adopted widely
in practical work on account of their comparative complexity and the
previously established popularity of the simpler methods of pars. 99-107.
(f) The newer method of applying linear compounding coefficients to
effect reproducing or non-reproducing interpolations minimizing the mean
square error in a specified order of differences (pars. 115-17) involves a
simple computation routine and may be expected to show improved
results; the recent censuses in several countries may afford opportunities for
its further exploration. (g) The practical usefulness of the theoretically
interesting “interlocking” formulae (par. 118) remains to be established.
(h) The idea of reproducing subtabulation minimizing the sum of the
squares of all differences of a given order by difference-equation
operations (par. 119), while again theoretically interesting, does not seem likely
to produce sufficiently improved results to justify its comparatively
laborious working processes.


(i) Graphic Method

96. A graphic treatment was naturally one of the first methods to be
applied on account of its apparently simple nature. The deaths and
populations in the age groups can be represented separately by a series of
rectangles, from which the values for individual years of age may be
ascertained by drawing a curve through the tops of those rectangles in such a
manner that the areas representative of each age group shall be unaltered.
By then reading the values for each age from the death curve and from the
population curve, $m_x$ is obtained directly by dividing the former by the
latter, and thence $p_x$ follows by the usual relation

$$p_x = \frac{2 - m_x}{2 + m_x};$$

or $p_x$ may be obtained directly from the deaths and populations by the
relation

$$p_x = \frac{2L_x - d_x}{2L_x + d_x}.$$

This graphic method was used originally by Milne in the construction
of the Carlisle Table (see G. King, J.I.A., XXIV, 186, and XLII, 226, and
see J.I.A., XVI, 221, for an interesting account of the life of Dr. John
Heysham, compiler of the data used by Milne). It was also illustrated by
Burridge (J.I.A., XXIII, 309, and XXIV, 333), Moors and Day (J.I.A.,
XXXVI, 151), and Grant (J.I.A., XL, 125). Notwithstanding its apparent
simplicity, however, it is difficult to read the individual values from the
curve with sufficient accuracy. It may therefore be desirable in some
cases to represent the data approximately by a mathematical formula,
such as by a Makeham function (see par. 125 here); and then, using this
as a base line, to graduate graphically the differences between the
resulting series and the original data (see G. F. Hardy, J.I.A., XXV, 229). A
further graphic graduation of the values of $m_x$ or $p_x$ as determined from
the population and death curves may also be advantageous (cf. J.I.A.,
XXIII, 321, and XXIV, 203 and 339).

(ii) Dr. Farr’s Methods

97. In the construction of the earliest English Life Tables the deaths
and populations were available in the age groups 5-9, 10-14, thence
decennially to 94, with a final group for ages 95 and over; and Dr. Farr
assumed that by dividing the deaths by the population in these groups the
resulting function, which is analogous to the

$$\frac{l_x - l_{x+n}}{T_x - T_{x+n}}$$

of the life table, could be taken as the central death rate for the year of age
central to the group, namely $m_{x+\frac{n-1}{2}}$, or as the force of mortality for
the central point of age, that is, $\mu_{x+\frac{n}{2}}$, these two functions being
assumed to be identical. It was therefore supposed that the data in these
age groups would thus yield immediately the numerical values of $m_7$, $m_{12}$,
$m_{19.5}$ (or $\mu_{20}$), $m_{29.5}$ (or $\mu_{30}$), and so on. The values of $p_7$ and $p_{12}$ thence
follow by the formula of par. 96, a special adjustment being made to the
value at age 12 “to allow for the turn of the curve” (J.I.A., IX, 135). At
the subsequent ages two different methods were used in passing to $p_{20}$,
$p_{30}$, etc. In Farr’s First Method, which presumably was used by him in
preparing English Life Tables Nos. 1 and 2 (see J.I.A., XLII, 288), it was
assumed that $m_{29.5} = r^{10} m_{19.5}$, from which $m_{20}$ was obtained as $m_{19.5}\,r^{1/2}$, and
the subsequent intervals were treated similarly, $p_{20}$, etc., thence
following as usual. In his Second Method, which was applied to the English
Life Table No. 3 and the Healthy English Table No. 1, the values of $p_{20}$,
etc., were determined by calculating $\log p_x$ directly from the supposed
values of $\mu_x$ by the formula

$$\log_{10} p_x = -\frac{k\,\mu_x\,(r - 1)}{\log_e r}$$

which was obtained as follows: Assuming (as in the similar assumption
with regard to $m$ in the first method) that $\mu_{x+10} = r^{10}\mu_x$, so that
$\mu_{x+t} = r^t\mu_x$ as in Gompertz’s formula, it follows, since

$$p_x = e^{-\int_0^1 \mu_{x+t}\,dt}, \quad\text{that}\quad \operatorname{colog}_e p_x = \int_0^1 \mu_x r^t\,dt = \frac{r - 1}{\log_e r}\,\mu_x;$$

from which the above formula results, $k$ being the modulus. The
logarithms of $p_7$, $p_{12}$, $p_{20}$, $p_{30}$, etc., were then made the subject of interpolation
by ordinary third differences to get the values at each age.
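
The closed form used in the Second Method can be compared with a direct numerical integration of $\mu_x r^t$ over the year of age. The sketch below assumes the Gompertz relation $\mu_{x+t} = r^t\mu_x$ stated in the text; the function names and the values of $\mu_x$ and $r$ are hypothetical.

```python
import math


def farr_log10_p(mu_x, r):
    """log10 p_x = -k * mu_x * (r - 1) / ln(r), with k the modulus log10(e)."""
    k = math.log10(math.e)
    return -k * mu_x * (r - 1) / math.log(r)


def colog_e_p_numeric(mu_x, r, steps=10000):
    """Midpoint-rule integral of mu_x * r**t for t from 0 to 1."""
    h = 1.0 / steps
    return sum(mu_x * r ** ((i + 0.5) * h) for i in range(steps)) * h


# Hypothetical values: force of mortality .01, increasing 8 per cent per year of age.
mu_x, r = 0.01, 1.08
closed = farr_log10_p(mu_x, r)
numeric = -math.log10(math.e) * colog_e_p_numeric(mu_x, r)
```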

The principal objection to Farr’s method lies in the inaccuracy of the
assumption that, in effect,

$$\frac{l_x - l_{x+n}}{T_x - T_{x+n}}$$

can be taken as an approximation to $m_{x+\frac{n-1}{2}}$ or $\mu_{x+\frac{n}{2}}$; for, as shown
by King in J.I.A., XLII, 232-33, that assumption “leads to erroneous
results of serious magnitude.”* His method may now be considered as of
historical interest only.

(iii) Later English Life-Table Methods
98. Some years after the production of English Life Table No. 3, No. 4
was constructed by applying to the rates of mortality of No. 3 the ratios
in which the mean annual death-rates shown by the data of the two tables
had altered (see J.I.A., XXVII, 494; XXIX, 20; and XLII, 288). A
complete change of method, however, was subsequently made in the
preparation of the next tables, namely, Nos. 5 and 6, the corresponding Healthy
English Tables Nos. 2 and 3, and the London Life Table. The procedures
which were then adopted are generally associated with the names of
Dr. John Tatham and A. C. Waters (see J.I.A., XLII, 234-35). They are
entirely free from the questionable assumptions made by Dr. Farr.

* Dr. M. Greenwood in J.I.A., LVIII, 15.?, however, expressed the opinion that
“this seems to do some little injustice to the old tables.” He stated that in The Lancet
(1922), II, 719, he had computed the expectations of life at various ages by King’s
method for comparison with Farr’s, with the following results: “At late ages the error
was, of course, relatively large (for example, at age 75 there was an error of nearly 6
months in the expectation of life); but at an age as late as 45 the effect of the
correction [by King’s method] was to alter the expectation of life by 22 days, and for the
purposes for which Dr. Farr used those tables perhaps the error was not very serious.”
V. P. A. Derrick has also pointed out in J.I.A., LVIII, 159, that the understatement of
mortality by Farr’s method at the older ages “was compensated by the probability
that the records themselves already overstated mortality through age misstatements
in the original data, so that really Dr. Farr’s mortality rates at these ages might be
regarded, in his opinion, as superior to those given later by the improved method of
Mr. King.”





In these tables the data consisted of the populations and deaths in
certain fixed age groups, which could not be altered since tabulations by
single years were not available. Under such circumstances, and also when
the data by single years have been thrown into appropriate age groups on
the principles of pars. 53-54, the data consist in effect of values of
populations ($T'_x - T'_{x+n}$) and deaths ($l'_x - l'_{x+n}$), where $x$ is the first age of the
group and $n$ its range, and $T'_x$ and $l'_x$ denote in the actual community the
population and deaths at age $x$ and beyond in the same manner that $T_x$
and $l_x$ are used in the life-table community. Summing from the bottom
upwards, therefore, we get $T'_x$ and $l'_x$ for the points of division of the
groups. Thus when the original age groups were quinquennial to 15,
thence decennial to 95, with a final group at 95 and over, as in the English
Life Tables to No. 6 inclusive, the values so obtained were for ages 0, 5,
10, 15, 25, 35, . . . , 95. The values at individual ages may then be
obtained by interpolation, and the first differences taken negatively give
values of the population $L'_x$ and deaths $d'_x$ of the actual community in each
year of age, from which $m_x$ follows by dividing the deaths by the
population as in par. 95, and hence $p_x$ or $q_x$ as usual.
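
The operations of summing from the bottom upwards and of taking first differences negatively are inverse to one another, as the following sketch shows. The function names and the grouped figures are hypothetical; the interpolation between the points of division, which is the substance of the methods discussed, is deliberately omitted.

```python
def cumulate_from_bottom(group_totals):
    """Totals 'at age x and beyond' for the first age of each group."""
    running, out = 0, []
    for total in reversed(group_totals):
        running += total
        out.append(running)
    return out[::-1]


def negative_first_differences(values):
    """First differences taken negatively: values[i] - values[i + 1]."""
    return [values[i] - values[i + 1] for i in range(len(values) - 1)]


# Hypothetical grouped data (three age groups, the last one open-ended).
populations = [5000, 3000, 2000]
deaths = [50, 45, 80]

T_prime = cumulate_from_bottom(populations)
l_prime = cumulate_from_bottom(deaths)
recovered_groups = negative_first_differences(T_prime)
```

In practice the cumulated values are first interpolated to every age, and the negative differences are then taken at unit intervals to give the values for single years of age.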

Numerous variations in detail have been made in applying these
principles. In English Life Table No. 5 the values of $(2T'_x + l'_x)$ and $(2T'_x - l'_x)$
were calculated for the points of division, and the similar values at ages
one year older were obtained by interpolation; then $(2L'_x + d'_x)$ and
$(2L'_x - d'_x)$, and hence

$$p_x = \frac{2L'_x - d'_x}{2L'_x + d'_x},$$

were found by differencing at the points of division, and from these values
the series $p_x$ was completed by interpolation. In the London Life Table
$(2L'_x + d'_x)$ and $(2L'_x - d'_x)$ were obtained at each age from interpolations
of $\log (2T'_x + l'_x)$ and $\log (2T'_x - l'_x)$, while in English Life No. 6 the
interpolations were based on $\log (2T'_x)$ and $\log l'_x$. Other appropriate functions,
such as $T'_x$ and $l'_x$ directly, $\log q_x$, or $\log (q_x + .1)$ may be used as in the
later developments of these methods described in par. 99. The ratios of
$T'_x$ and of $l'_x$ to the corresponding $T_x$ and $l_x$ values of a smoothly graduated
life table, or functions such as

io«r; O'- 

also provide a valuable method of performing such graduations with
reference to a standard table (see A. Henry, J.I.A., XLVII, 407; Henderson’s
“Mortality Laws and Statistics,” p. 60; “The Fundamental Principles of
Mathematical Statistics,” pp. 85-86; and par. 111 here).

In all these tables, moreover, an important innovation was the
employment of a double process of interpolation (by constant fourth differences),
and the blending of the results by the Curve of Sines with the object of
smoothing the breaks of continuity which occur at the points of division
when ordinary interpolations are made from separate abutting series.
(The importance of smoothing the discontinuities which thus occur at the
points of junction with ordinary interpolations, resulting from the fact
that each such interpolation segment “cuts its successor at an angle” [cf.
J. E. Kerrich’s description, J.I.A., LXVI, 88], may be seen readily from
the diagrams on p. 22a of the monograph “Elements of Graduation” by
M. D. Miller et al., published by the Actuarial Society of America and
the American Institute of Actuaries, 1946). By this method, for example,
“at ages 26-34, inclusive, there was one set of values derived from the
group with central age 25, and another from the group with its centre at
age 35. A mean of the two values at each age was formed by multiplying
the first set by the following factors:

        (1) .97553        (4) .65451        (7) .20611

        (2) .90451        (5) .50000        (8) .09549

        (3) .79389        (6) .34549        (9) .02447

and the second by these factors reversed, and by adding together the two
products at each age. By this means the greatest weight was given to
those terms nearest the centre of a group, and the least to those farthest
from it. The factors are empirical, and are derived from the ‘curve of
sines’; they are the numerical values of the expression



when $x$ is given successive integral values from 1 to 9” (see 65th Annual
Report of the Registrar-General, Part 1, p. xvii). The method produced
results which George King described at that time as being “exceedingly
well graduated.” It is of interest to note, as shown in J.I.A., XLII, by
Lidstone (p. 284) and J. Buchanan (p. 382), that Sprague’s fifth difference
osculatory formula (see par. 100) uses, in effect, a blend of two ordinary
interpolations in proportions very similar to those of the “curve of sines.”*
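
The nine blending factors can be reproduced to five decimals from the expression $\tfrac{1}{2}(1 + \cos 18^{\circ}x)$. That expression is an assumption adopted here because it reproduces the factors; it is not necessarily the form in which the Registrar-General’s Report states it. The function names below are likewise hypothetical.

```python
import math


def curve_of_sines_factors():
    """Factors for x = 1..9 from the assumed expression (1 + cos(18 deg * x)) / 2."""
    return [round(0.5 * (1 + math.cos(math.pi * x / 10)), 5) for x in range(1, 10)]


def blend(first_set, second_set):
    """Weight the first set by the factors and the second by the factors reversed."""
    w = curve_of_sines_factors()
    w_reversed = w[::-1]
    return [a * wa + b * wb
            for a, b, wa, wb in zip(first_set, second_set, w, w_reversed)]


factors = curve_of_sines_factors()
```

Each factor and its reversed counterpart sum to unity, so that where the two interpolated sets agree the blended value is unchanged.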

(iv) Tangential and Osculatory Reproducing Interpolation Formulae on
Sprague’s Principles

* Other “overlapping” methods of ordinary interpolation may also frequently give
good results (see J.I.A., L, 126).




99. On account of the empirical and comparatively laborious nature of
the “curve of sines” process, George King next suggested in J.I.A.,
XLII, 238, that tangential or osculatory reproducing interpolation*
could be applied more readily; and in that paper he reconstructed the
Carlisle table and the English Life Tables Nos. 3 and 6 (males) by the
use of Sprague’s fifth difference osculatory formula (see par. 100 here)
and also by Karup’s third difference tangential formula (which King
produced independently, par. 101).

In these reconstructions a preliminary calculation was first made of
the values of $T'_x$ and $l'_x$ for quinquennial ages throughout instead of for
the decennial ages 15, 25, 35, etc., as in the original data, the ordinary
formulae for bisection of groups being used. The interpolations were then
performed, for purposes of investigation, in four different ways, namely,
(A) By applying Sprague’s fifth difference osculatory formula to find $T'_x$
and $l'_x$, and hence, by differencing, $L'_x$ and $d'_x$ at each age; (B) By using log
$T'_x$ and log $l'_x$ as the basis of the fifth difference osculatory interpolation;
(C) By finding quinquennial values of log $q_x$ by ordinary interpolation
from the quinquennial values of $T'_x$ and $l'_x$, and thence log $q_x$ at each age
by the fifth difference osculatory method; and (D) By proceeding as in C,
except that the Karup-King third difference tangential formula was
employed.

This principle of using tangential or osculatory interpolation as thus
suggested by King was adopted in a number of instances. In the United

* The term osculatory, following Sprague’s designation of his original process
(par. 100) and its subsequent wide use to describe also the Karup-King formula
(par. 101) and others, has been employed generally in actuarial literature to cover
all those formulae for which one or more derivatives of adjacent curve segments are
equal at the points of junction. Strict mathematical terminology, however, suggests
that this label is applied correctly only when equality of at least the first and second
derivatives is required, so that, conformably, the word tangential (as suggested by
H. S. Beers, T.A.S.A., XLVI, 83-84), might be used for those formulae which
produce equal first derivatives only. These two distinctive terms are therefore adopted
here.

The designations (a) reproducing and (b) non-reproducing are also used, as explained
further in the text, in order to maintain a clear distinction between (a) the tangential
or osculatory formulae which (on Sprague’s or Henderson’s principles) are true
interpolation formulae reproducing the given values, and (b) the tangential or osculatory
formulae which (on Jenkins’ principles) do not reproduce the given values and
consequently introduce into the procedures an element of graduation or smoothing. (The
term “non-reproducing” seems preferable to the word “modified” which, following
Jenkins’ original paper, has been used frequently to describe these non-reproducing
formulae.)




States the fifth difference formula was applied in 1910 with King’s
Construction A by J. W. Glover in preparing a mortality table for the
Registration States (J.A.S.A., XII, 85); H. L. Rietz and C. H. Forsyth in 1911
employed it for the production of a rural U.S. life table (R.A.I.A., I, 9);
C. H. Forsyth adopted it again in 1914 in constructing tables for the
registration area for 1901-10 (J.A.S.A., XIV, 228); and Construction A
(with slight variations) was used extensively by Glover in the U.S. Life
Tables 1890, 1901, 1910, and 1901-10. The third difference formula, with
which King experimented in Construction D, has not been so used; it has,
however, been employed recently by Greville for the special interpolation
problems between ages 12 and 27 in all the U.S. life tables of 1939-41
(see p. 126 thereof), and it has formed an important part of King’s later
development of his “pivotal value” method which is described in par. 120
here.

100. Sprague’s fifth difference osculatory formula secures smoothness at
the points occupied by the original data by providing that the curve of
the fifth degree passing through the central interval $u_2$ to $u_3$ in the given
six-point series $u_0$, $u_1$, $u_2$, $u_3$, $u_4$, and $u_5$ shall have, at the point whose
ordinate is $u_2$, the same tangent and radius of curvature (i.e., the same
first and second differential coefficients, being second order contact) as the
partial Newton-Stirling curve of the fourth order through $u_0 \ldots u_4$, and
shall similarly, at the point whose ordinate is $u_3$, have the same tangent
and radius of curvature as the partial curve of the fourth order through
$u_1 \ldots u_5$. In this process the other important condition laid down is that
the formula must reproduce the given values exactly, i.e., it shall be a
reproducing interpolation formula only (cf. par. 107 where the subsequent
development of non-reproducing osculatory formulae is discussed). The
derivation of the formula, which was introduced in 1880 by T. B. Sprague
(J.I.A., XXII, 270), is shown conveniently by King in J.I.A., XLII, 239;
and an alternative demonstration has been given by G. J. Lidstone in
J.I.A., XLII, 394, which is assisted by the diagram included with this
proof in the U.S. Life Tables, 1890, etc., p. 345.*

The formula may be stated in a number of ways:

(a) In terms of ordinary advancing differences when derived as above

* Although osculatory reproducing formulae beyond Sprague’s 5th differences and
2nd order contact are not usually required for practical purposes, it may be noted
that J. F. Reilly (in R.A.I.A., XIII, 4, and XIV, 12) generalized Sprague’s and
Lidstone’s proofs for differences of odd order $2n + 1$ and a general order of contact, and
showed in particular the expressions with 5th differences and 3rd order contact, 7th
differences and 2nd order contact, and 7th differences and 3rd order contact.




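for the interval from $u_2$ to $u_3$ it is ($x < 1$)

$$u_{2+x} = u_0 + (2+x)\Delta u_0 + \frac{(2+x)(1+x)}{2}\Delta^2 u_0 + \frac{(2+x)(1+x)x}{6}\Delta^3 u_0 + \frac{(2+x)(1+x)x(x-1)}{24}\Delta^4 u_0 + \frac{x^3(x-1)(5x-7)}{24}\Delta^5 u_0$$

in which the first three terms may be written otherwise as

$$u_2 + x\Delta u_1 + \frac{x(x+1)}{2}\Delta^2 u_0.$$

In this form it is seen to differ from the usual interpolation formula only
in the fifth difference term; and it is applied easily by calculating the
leading differences for subdivided intervals as shown in the papers already
mentioned.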

(b) An alternative form given by Karup (Trans. 2nd International
Actuarial Congress, p. 82) in which the differences are taken centrally is

$$u_x = u_0 + x\Delta u_0 + \frac{x(x-1)}{2}\Delta^2 u_{-1} + \frac{x(x^2-1)}{6}\Delta^3 u_{-1} + \frac{x(x-1)(x-2)(x+1)}{24}\Delta^4 u_{-2} + \frac{x^3(x-1)(5x-7)}{24}\Delta^5 u_{-2}$$

for which he gives a special working process of considerable facility which
is well illustrated by J. Buchanan in J.I.A., XLII, 385-92.

(c) The last form may be expressed at once in W. F. Sheppard’s very
convenient central difference notation, in which

$$\delta u_x = u_{x+\frac12} - u_{x-\frac12} \quad\text{and}\quad \mu\delta^n u_x = \tfrac12\left(\delta^n u_{x-\frac12} + \delta^n u_{x+\frac12}\right)$$

so that

$$\delta u_{\frac12} = u_1 - u_0\ (= \Delta u_0), \qquad \delta^2 u_0 = u_1 - 2u_0 + u_{-1}\ (= \Delta^2 u_{-1}),$$

$$\delta^3 u_{\frac12} = u_2 - 3u_1 + 3u_0 - u_{-1}\ (= \Delta^3 u_{-1}),$$

$$\delta^4 u_0 = u_2 - 4u_1 + 6u_0 - 4u_{-1} + u_{-2}\ (= \Delta^4 u_{-2}),$$

and so on, and $\mu\delta u_0 = \tfrac12(\delta u_{-\frac12} + \delta u_{\frac12})$, etc. (see Buchanan, J.I.A., XLII,
370 and 379, and Henderson, T.A.S.A., XXII, 175); and when written in
“Everett’s form,” in which $y = 1 - x$ for symmetry, it becomes

$$u_x = x u_1 + \frac{x(x^2-1)}{6}\delta^2 u_1 + \frac{x^3(x-1)(5x-7)}{24}\delta^4 u_1 + y u_0 + \frac{y(y^2-1)}{6}\delta^2 u_0 + \frac{y^3(y-1)(5y-7)}{24}\delta^4 u_0 \qquad (40)$$

as given by Buchanan in J.I.A., XLII, 379 (where f = y), and in
Henderson’s “Mortality Laws and Statistics,” p. 76 (formula 6) for interval t.




The working process in this form is very simple, as shown in J.I.A., XLII,
390, and T.A.S.A., XXII, 194.*
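
The working of Sprague’s formula in Everett’s form may be sketched as follows. The coefficients are those of the fifth difference osculatory formula as commonly stated, $x(x^2-1)/6$ for $\delta^2$ and $x^3(x-1)(5x-7)/24$ for $\delta^4$; the function name, the argument layout, and the test data are assumptions made for the illustration.

```python
def sprague_everett(u, x):
    """Sprague's fifth difference osculatory formula in Everett's form.

    u : six equidistant values (u_-2, u_-1, u_0, u_1, u_2, u_3)
    x : fraction of the central interval u_0 .. u_1, with 0 <= x <= 1
    """
    def d2(i):  # central second difference at list position i
        return u[i + 1] - 2 * u[i] + u[i - 1]

    def d4(i):  # central fourth difference at list position i
        return u[i + 2] - 4 * u[i + 1] + 6 * u[i] - 4 * u[i - 1] + u[i - 2]

    def half(t, i):
        return (t * u[i]
                + t * (t * t - 1) / 6 * d2(i)
                + t ** 3 * (t - 1) * (5 * t - 7) / 24 * d4(i))

    y = 1 - x
    return half(x, 3) + half(y, 2)  # u_1 is at position 3, u_0 at position 2


# Hypothetical data: a fourth degree function, which the formula reproduces exactly.
u = [float(t ** 4) for t in range(-2, 4)]
value = sprague_everett(u, 0.3)
```

Since the formula departs from ordinary central difference interpolation only in its treatment of fifth differences, any function of the fourth degree is interpolated without error, and the given values are reproduced at $x = 0$ and $x = 1$.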

101. The Karup-King third difference tangential formula was deduced
originally in 1898 by J. Karup in Trans. 2nd International Actuarial
Congress, p. 83, and was subsequently derived independently in 1907 by
King in J.I.A., XLI, 545. An alternative demonstration is given by
Lidstone in J.I.A., XLII, 394. The formula (which is again a
“reproducing” interpolation formula) provides that the third degree curve through
the central interval $u_1$ to $u_2$ of the four-point series $u_0$, $u_1$, $u_2$, and $u_3$ shall
have at $u_1$ and $u_2$ the same tangents (first differential coefficients) as the
partial Newton-Stirling curves of the second degree through $u_0 \ldots u_2$
and $u_1 \ldots u_3$ respectively. It may be expressed in the following forms:

(a) In terms of ordinary advancing differences it is

$$u_{1+x} = u_1 + x\Delta u_0 + \frac{x(x+1)}{2}\Delta^2 u_0 + \frac{x^2(x-1)}{2}\Delta^3 u_0;$$

and in this form it is easily applied as shown by King in J.I.A., XLII, 245,
and Supplement (Part 1) to the Registrar-General’s 75th Report, p. 52.

(b) Karup gives it in terms of differences taken centrally as

$$u_x = u_0 + x\Delta u_0 + \frac{x(x-1)}{2}\Delta^2 u_{-1} + \frac{x^2(x-1)}{2}\Delta^3 u_{-1}.$$

(c) In Everett’s form ($y = 1 - x$) with Sheppard’s notation it
becomes

$$u_x = x u_1 + \frac{x^2(x-1)}{2}\delta^2 u_1 + y u_0 + \frac{y^2(y-1)}{2}\delta^2 u_0 \qquad (41)$$

as shown by Buchanan, J.I.A., XLII, 378, and also by Henderson in
T.A.S.A., XXII, 189, and “Mortality Laws and Statistics,” p. 77
(formula 7) for interval t. The numerical work in this form is extremely
convenient (cf. J.I.A., XLIII, 159-60).
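
The Karup-King formula in Everett’s form requires only second differences, which accounts for the convenience of the numerical work. A minimal sketch, using the coefficient $x^2(x-1)/2$ of the formula as commonly stated; the function name and the test data are hypothetical.

```python
def karup_king(u, x):
    """Karup-King third difference tangential formula in Everett's form.

    u : four equidistant values (u_-1, u_0, u_1, u_2)
    x : fraction of the central interval u_0 .. u_1, with 0 <= x <= 1
    """
    d2_u0 = u[2] - 2 * u[1] + u[0]  # central second difference at u_0
    d2_u1 = u[3] - 2 * u[2] + u[1]  # central second difference at u_1
    y = 1 - x
    return (x * u[2] + x * x * (x - 1) / 2 * d2_u1
            + y * u[1] + y * y * (y - 1) / 2 * d2_u0)


# Hypothetical data: u_t = t**2 at t = -1, 0, 1, 2.
u = [1.0, 0.0, 1.0, 4.0]
value = karup_king(u, 0.25)
```

A function of the second degree is interpolated without error, and the given values are reproduced at the ends of the interval.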

102. The preceding “reproducing” formulae of pars. 100 and 101 are
designed to secure smoothness of junction of the interpolated results at
the points occupied by the original data. It is also to be remembered,

* Still using Sprague’s basic assumptions except that the differential coefficients
were determined from the mean of their values in the partial curves, J. Buchanan in
J.I.A., XLII, 374, reached a variation of Sprague’s formula of par. 100(c) in which
the coefficient of $\delta^4 u_1$ was

.r*(jr 4) 

12 


and similarly for $\delta^4 u_0$. (See also T.F.A., XII, 124, and W. A. Jenkins in R.A.I.A., XV,
90, and T.A.S.A., XXVIII, 198.)





however, as stated in par. 95, that the problem here under consideration
is ordinarily not merely one of smoothing, but is primarily that of
redistributing the approximately correct quinquennial (or decennial)
groupings into the values at each age so that, while the irregularities at
individual ages will be removed, the totals in each group will not be
unduly disturbed.

In 1913, accordingly, S. T. Shovelton (J.I.A., XLVII, 284) investigated
the effect of introducing directly the condition of reproduction of grouped
data. Requiring that the interpolation curve shall be of the fourth degree
and that it shall have the same ordinates and tangents at the point of
junction as the partial curves based on five given values, he imposed an
additional condition designed to effect the reproduction of grouped data.
If the ungraduated values are grouped in fives and pivotal values (see
par. 120 here) are then found by King’s formula (49), and if the formula
in question is next applied to obtain intermediate graduated values, it
was required that the quinquennial sums of the graduated and
ungraduated values shall agree to and including the fifth differences of these sums.
The resulting expression thus became, in Everett’s form, Shovelton’s
six-point tangential formula

$$u_x = x u_1 + \frac{x(x^2-1)}{6}\delta^2 u_1 + \frac{x^2(x-1)(x-5)}{48}\delta^4 u_1 + y u_0 + \frac{y(y^2-1)}{6}\delta^2 u_0 + \frac{y^2(y-1)(y-5)}{48}\delta^4 u_0 \qquad (42)$$

which was shown to give a very satisfactory interpolation (see also
R.A.I.A., XXXIV, 57).

Shovelton also gave an alternative derivation of the formula in which
the condition of reproduction of grouped values is replaced by the
requirement that the area under the interpolation curve in each interval
shall be equal to the average of the areas under the partial curves in the
same interval. It was found, incidentally, that Sprague’s fifth difference
formula satisfies the condition as to the area under the interpolation
curve. In the case of a third degree formula based on five given values,
Shovelton also showed that both his methods of derivation produce the
Karup-King formula of par. 101.
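
Shovelton’s six-point formula may be sketched in the same manner. The fourth difference coefficient $x^2(x-1)(x-5)/48$ used below is the form in which the formula is commonly quoted, and it satisfies the tangent and area conditions described above; it is nevertheless stated here as an assumption, and the function name and the test data are hypothetical.

```python
def shovelton(u, x):
    """Shovelton's six-point tangential formula in Everett's form (assumed coefficients).

    u : six equidistant values (u_-2, u_-1, u_0, u_1, u_2, u_3)
    x : fraction of the central interval u_0 .. u_1, with 0 <= x <= 1
    """
    def d2(i):
        return u[i + 1] - 2 * u[i] + u[i - 1]

    def d4(i):
        return u[i + 2] - 4 * u[i + 1] + 6 * u[i] - 4 * u[i - 1] + u[i - 2]

    def half(t, i):
        return (t * u[i]
                + t * (t * t - 1) / 6 * d2(i)
                + t * t * (t - 1) * (t - 5) / 48 * d4(i))

    y = 1 - x
    return half(x, 3) + half(y, 2)


# Hypothetical data: a fourth degree function at t = -2 .. 3.
u = [float(t ** 4) for t in range(-2, 4)]
value = shovelton(u, 0.3)
```

As the interpolation curve is of the fourth degree and agrees with ordinary interpolation when fifth differences vanish, a function of the fourth degree is again interpolated without error.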

103. In determining the differential coefficients for the tangential or
osculatory “reproducing” formulae of the preceding paragraphs, the
partial curves are taken of one degree less than that of the final curve. This
restriction, however, is not necessary; and it was consequently shown in
1906 by R. Henderson (T.A.S.A., IX, 215-17) that Karup’s third
difference formula of par. 101 could be improved, without going as far as
Sprague’s original fifth difference method, by determining a third
difference curve with its first differential coefficients equal to those of two
partial Newton-Stirling fourth degree curves (instead of merely taking second
difference partial curves as in Karup’s formula). The resulting expression
in Everett’s form is Henderson’s six-point tangential formula

$$u_x = x u_1 + \frac{x(x^2-1)}{6}\delta^2 u_1 - \frac{x^2(x-1)}{12}\delta^4 u_1 + y u_0 + \frac{y(y^2-1)}{6}\delta^2 u_0 - \frac{y^2(y-1)}{12}\delta^4 u_0 \qquad (43)$$

as given by him in his “Mortality Laws and Statistics,” p. 77 (formula 8,
for interval t) and used on p. 97 thereof. It may also be written as


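$$u_x = u_0 + x\,\delta u_{\frac12} + \frac{x(x-1)}{2}B_0 + \frac{x(x^2-1)}{6}C_{\frac12} \qquad (44)$$


where $B_0 = \delta^2 u_0 - \tfrac16(\delta^4 u_0 - \delta^5 u_{\frac12})$ and $C_{\frac12} = \delta^3 u_{\frac12} - \tfrac12\delta^5 u_{\frac12}$ as in T.A.S.A., IX,
215-17, and XXII, 195 (example No. 3).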

This formula materially reduces the labour of applying Sprague’s
formula; for, as shown in T.A.S.A., IX, 216, the fourth and fifth
differences which are retained are in effect applied “merely as corrections
to the second and third orders respectively.”

(v) Tangential and Osculatory Reproducing Interpolation Formulae on
Henderson's Principles

104. The formulae of pars. 100-103 are all derived on Sprague’s original
principle of equating certain functions of the curve to the corresponding
functions of two ordinary partial curves at a central interval, in addition
to the requirement that the given values must be reproduced. The selection
of these partial curves is, of course (as remarked by Jenkins in
T.A.S.A., XXVIII, 198), somewhat arbitrary. Henderson accordingly in
1921 enunciated the new principle that it is preferable to discard the use
of partial curves, and instead simply to impose the condition that the
successive intervals are to be filled in by curves of the specified degree
with their constants so determined that the corresponding differential
coefficients at the points of junction shall be equal to each other
(cf. R.A.I.A., XIII, 23). He retained, however, Sprague’s other basic
requirement that the two curves must exactly reproduce the given values.
Following this principle, therefore, we may take formula (44), which is a
general expression for a function of the third degree between $u_0$ and
$u_1$, and “determine $B_x$ and $C_x$ from the conditions that the values
of the first and second differential coefficients should be continuous at
the points of junction, if differences of the sixth and higher orders
vanish.” We thus obtain easily,




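as in T.A.S.A., XXII, 190, the value of $C_x$ and the difference equation
$B_x + \tfrac{1}{6}\delta^2B_x = \delta^2u_x$. This last difference
equation gives $B_x = \delta^2(1 + \tfrac{1}{6}\delta^2)^{-1}u_x =
\delta^2u_x - \tfrac{1}{6}\delta^4u_x$ approximately, by neglecting
differences of the sixth and higher orders, and $C_x$ follows
correspondingly. Henderson's approximately osculatory formula is
consequently

$$u_x = xu_1 + \frac{x(x^2-1)}{6}\left(\delta^2u_1 - \tfrac{1}{6}\delta^4u_1\right) + yu_0 + \frac{y(y^2-1)}{6}\left(\delta^2u_0 - \tfrac{1}{6}\delta^4u_0\right). \tag{45}$$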

The numerical work is very convenient, as shown in T.A.S.A., IX, 219. 

The discontinuity in the first differential coefficient, which results
from the relation $B_x = \delta^2u_x - \tfrac{1}{6}\delta^4u_x$ being only
an approximate solution of the difference equation
$B_x + \tfrac{1}{6}\delta^2B_x = \delta^2u_x$, is
$\tfrac{1}{36}\delta^6u_x$ (as is shown easily in T.A.S.A., XXII, 186; see
also T.A.S.A., IX, 221, and R.A.I.A., XV, 88). The formula is now mainly
of historical interest as having occupied a prominent place in Henderson’s
departure from the basic principles originally used by Sprague.
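
As a rough illustration of formula (45), the following Python sketch applies Everett's second-difference form with $\delta^2u - \tfrac{1}{6}\delta^4u$ in place of the second differences, and evaluates the jump in the first differential coefficient where two arcs join. The function names and the equally spaced list `u` are illustrative assumptions, not notation from the text; the slopes are obtained by differentiating the cubic arcs directly.

```python
from math import comb


def central_difference(u, i, order):
    """Even-order central difference of the list u at index i."""
    h = order // 2
    return sum((-1) ** k * comb(order, k) * u[i + h - k] for k in range(order + 1))


def everett(u0, u1, b0, b1, x):
    """Everett's form to second differences, with b0, b1 used as the second differences."""
    y = 1.0 - x
    return x * u1 + x * (x * x - 1) / 6 * b1 + y * u0 + y * (y * y - 1) / 6 * b0


def approx_b(u, i):
    """Approximate solution B = d2 - d4/6 of the difference equation."""
    return central_difference(u, i, 2) - central_difference(u, i, 4) / 6


def henderson_45(u, i, x):
    """Interpolated value at i + x (0 <= x <= 1); needs u[i-2] .. u[i+3]."""
    return everett(u[i], u[i + 1], approx_b(u, i), approx_b(u, i + 1), x)


def slope_jump(u, i):
    """Right-hand slope minus left-hand slope of the joined cubic arcs at point i."""
    b_prev, b_here, b_next = (approx_b(u, j) for j in (i - 1, i, i + 1))
    right = u[i + 1] - u[i] - b_next / 6 - b_here / 3
    left = u[i] - u[i - 1] + b_here / 3 + b_prev / 6
    return right - left
```

On any equally spaced series the jump reduces algebraically to one thirty-sixth of the sixth central difference at the junction, so `slope_jump(u, i)` can be compared with `central_difference(u, i, 6) / 36` as a check.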

105. In view of the discontinuity of $\tfrac{1}{36}\delta^6u_x$ which is
thus involved in formula (45), Henderson subsequently showed (see
R.A.I.A., XIII, 24, following T.A.S.A., XXV, 30) that an exact solution of
the difference equation $B_x + \tfrac{1}{6}\delta^2B_x = \delta^2u_x$ can
be obtained by writing it in the form

$$\frac{(1+rE)(1+rE^{-1})}{(1+r)^2}\,B_x = \delta^2u_x$$

so that

$$\frac{r}{(1+r)^2} = \frac{1}{6}$$

and consequently $r = 2 - \sqrt{3} = .268$ with sufficient accuracy, where
$r$ is applied by means of the relations
$B'_x = \delta^2u_x + r(\delta^2u_x - B'_{x-1})$ and
$B_x = B'_x + r(B'_x - B_{x+1})$. Two arbitrary initial values are
necessary for $B'_x$ and $B_x$ in these relations. They may be determined
so that the first and last second differences of $B_x$ will vanish, which
from the difference equation at once gives $\delta^2u_x$ as the initial
value of $B_x$. For $B'_x$ “we start with an approximate value and work as
if it were correct; then to the values of $B_x$ so derived we apply a set
of corrections $k$, $-rk$, $r^2k$, $-r^3k$ where $k$ is so determined that
the corrected values of $B_x$ will satisfy the required conditions.” The
following example gives the actual work of deriving corrected values of
$B_x$, using this value of $r$ (see R.A.I.A., XIII, 25; note also
Henderson’s numerical comparisons in T.A.S.A., XXXV, 280, based on Grant’s
Canadian data, which there showed a better sum of 3rd differences
regardless of sign than the Karup-King 3rd difference formula, but a
result not as good as Jenkins’ 5th difference non-reproducing formula
(47)).


   x      u_x       Δu_x     δ²u_x     B'_x    Uncorrected   Correction   Corrected
                                                   B_x                       B_x

   5    21.899    - .404               -.341      -.372        -.077        -.449
  10    21.495    - .609    -.205      -.169      -.226        +.021        -.205
  15    20.886    - .642    -.033      +.003      +.045        -.006        +.039
  20    20.244    - .755    -.113      -.144      -.153        +.001        -.152
  25    19.489    - .877    -.122      -.116      -.111                     -.111
  30    18.612    -1.008    -.131      -.135      -.136                     -.136
  35    17.604    -1.140    -.132      -.131      -.131                     -.131
  40    16.464    -1.269    -.129      -.128      -.130                     -.130
  45    15.195    -1.385    -.116      -.113      -.119                     -.119
  50    13.810    -1.476    -.091      -.085      -.092                     -.092
  55    12.334    -1.532    -.056      -.048      -.057                     -.057
  60    10.802    -1.544    -.012      -.002      -.013                     -.013
  65     9.258    -1.506    +.038      +.049      +.039                     +.039
  70     7.752    -1.419    +.087      +.097      +.087                     +.087
  75     6.333    -1.287    +.132      +.141      +.136                     +.136
  80     5.046    -1.129    +.158      +.163      +.158                     +.158
  85     3.917                                    +.180                     +.180


The osculatory interpolation on the principle of par. 104 may now be
performed exactly, instead of approximately as in formula (45), by using
these corrected values of $B_x$ as if they were the actual second
differences in Everett’s ordinary central difference formula

$$u_x = xu_1 + \frac{x(x^2-1)}{6}\,\delta^2u_1 + yu_0 + \frac{y(y^2-1)}{6}\,\delta^2u_0.$$
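
The two recurrence relations of par. 105 can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `exact_b` and `second_differences` are hypothetical, the list `U` holds the values of $u_x$ as read from the worked example above, $-.341$ is taken as the approximate starting value of $B'_x$, the upper starting value of $B_x$ is chosen so that the last available $B_x$ equals the last second difference, and the corrections $k, -rk, r^2k, \ldots$ are applied at every point rather than being cut off after four terms.

```python
import math

R = 2 - math.sqrt(3)  # satisfies (1 + R)**2 == 6 * R; about .268


def second_differences(u):
    """Central second differences at the interior points of u."""
    return [u[i - 1] - 2 * u[i] + u[i + 1] for i in range(1, len(u) - 1)]


def exact_b(u, b_prime_start):
    """Values B_0 .. B_n solving B + (1/6) d2B = d2u at the interior points."""
    n = len(u) - 1
    d2 = [None] + second_differences(u)          # d2[i] for i = 1 .. n-1
    # Forward pass: B'_i = d2_i + r (d2_i - B'_{i-1})
    b_prime = [b_prime_start]
    for i in range(1, n):
        b_prime.append(d2[i] + R * (d2[i] - b_prime[i - 1]))
    # Backward pass: B_i = B'_i + r (B'_i - B_{i+1}), started so that
    # B_{n-1} equals the last second difference
    b = [0.0] * (n + 1)
    b[n] = ((1 + R) * b_prime[n - 1] - d2[n - 1]) / R
    for i in range(n - 1, -1, -1):
        b[i] = b_prime[i] + R * (b_prime[i] - b[i + 1])
    # Corrections k, -rk, r^2 k, ... chosen so that B_1 equals the first second difference
    k = (b[1] - d2[1]) / R
    return [b[i] + k * (-R) ** i for i in range(n + 1)]


# Values of u_x at x = 5, 10, ..., 85, as read from the example in the text
U = [21.899, 21.495, 20.886, 20.244, 19.489, 18.612, 17.604, 16.464, 15.195,
     13.810, 12.334, 10.802, 9.258, 7.752, 6.333, 5.046, 3.917]
B = exact_b(U, -0.341)
```

Because the sketch carries full floating-point precision instead of three decimals, its results may differ from the tabulated corrected values by a unit or so in the third place.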


106. The production of a set of exact tangential or osculatory
reproducing formulae on Henderson’s principles of par. 104, namely, that
the corresponding differential coefficients of each curve at the common
point must be equal to each other (but not necessarily equal to any
predetermined values as laid down by Sprague), and that the given values
must be reproduced, was investigated next by W. A. Jenkins in 1926
(R.A.I.A., XV, 87 and 191). The third difference tangential formula
emerging on these assumptions was, interestingly enough, the Karup-King
formula (which, as stated in par. 101, was derived originally by Sprague’s
method of using values for the first differential coefficients
predetermined from partial curves). For the fifth difference case with 2nd
order contact,*

* The cunrlilion that the given values isiiist l>e repn)duce<l, i.c., that the two 
curves must take the given value at the common point, was met hy tiikiiig ^(a*) - 
Jf(.« — l)^(.v), and ^(.v) was assumed to he a ])olynomial. Then ^(a-) — tio -{- ^/i.c I- 
gives Jenkins* formula alxivc; ^(jf) - uci h Oix + -| (which is redundant to 





















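however, Jenkins' fifth difference osculatory reproducing formula was found
to be

$$u_x = xu_1 + \frac{x(x^2-1)}{6}\,\delta^2u_1 - \frac{x^3(x-1)}{12}\,\delta^4u_1 + yu_0 + \frac{y(y^2-1)}{6}\,\delta^2u_0 - \frac{y^3(y-1)}{12}\,\delta^4u_0 \tag{46}$$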


(vi) Tangential and Osculatory Non-reproducing Interpolation Formulae
on Jenkins' Principles

107.* All the preceding formulae derived by Sprague’s assumptions 
which use partial curves, or from Henderson’s which discard the partial 
curves and simply require that the differential coefficients shall be
continuous, have in common the other basic condition that (being
“reproducing” formulae for interpolation) the two curves must take the given value
at the common point. When such formulae have been used, as in many of 
the population tables, to fill in the values between certain predetermined 
points, it has been found—unless the values at those points themselves lie 
upon a smooth curve—that the whole curve which finally results will 
show many undulations and points of inflexion, even though it will be free 
from discontinuities (cf. J. Buchanan, T.F.A., XII, 124). In order to 
meet this weakness W. A. Jenkins in 1927 (T.A.S.A., XXVIII, 198) 
therefore released the two curves from the requirement that they must 
take the given value at the common point, and instead permitted, in 
effect, that the interpolated value shall differ from the given value by a 
fraction of its 2nd difference in the 3rd difference formula, or of its 4th 
difference in the 5th difference formula. Since the predetermined points
are thus not to be reproduced, the resulting non-reproducing formulae 
(which earlier were called “modified” formulae) will evidently effect some 
adjustment of those values in addition to performing a tangential or 
osculatory interpolation for the intermediate values. Jenkins gave a 
general expression, and the 3rd and 5th difference formulae, in T.A.S.A., 


the extent of one degree in x since the term asx* is not necessary) produces Sprague’s 
formula (40) when — 2 \, and Buchanan’s (see footnote to par. t(X)) when ui = — I; 
and ^(x) = ao + aix gives Henderson’s (45). The proofs used by Jenkins are shown 
clearly in his papers (loc. cit.) —on which Greville’s remarks in T.A.S.A., XLV, 217-18 
should now be noted. 

* The wording of this paragraph and some later paragraphs is taken, with slight
variations, from “The Fundamental Principles of Mathematical Statistics.” Section
XI, pp. 119-48, of that volume, which gives “An Outline of a Course in Graduation”
(covering the principal methods proposed up to 1942), and the accompanying
technical discussions and references elsewhere therein, should be consulted for a more
complete and detailed presentation of the basic theory and the practice of graduation
than is necessary in the condensed treatment appropriate for this Study.




XXVIII, 199-206, and the general form for even orders of differences,
with the 4th difference formula, in T.A.S.A., XXXI, 10-12 and 24-30;
the proof of the 5th difference formula is also given conveniently in T.F.A.,
XII, 138 (on which the comment by Reid and Dow in T.F.A., XIV, 189,
should be noted), and in Miller’s “Elements of Graduation,” p. 70, while
W. G. P. Lindsay produced an elegant alternative demonstration in
T.F.A., XIV, 211.

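Jenkins' fifth difference osculatory non-reproducing formula which he
thus reached is

$$u_x = xu_1 + \frac{x(x^2-1)}{6}\,\delta^2u_1 - \frac{x^3}{36}\,\delta^4u_1 + yu_0 + \frac{y(y^2-1)}{6}\,\delta^2u_0 - \frac{y^3}{36}\,\delta^4u_0 \tag{47}$$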


The correction to each given value amounts to $\tfrac{1}{36}$ of the
negative of the 4th central difference, as may be seen by putting $x = 0$
(and $y = 1$) in (47). In practice it is consequently important to choose
the formula which extends into an order of differences alternating in
sign, for otherwise the graduated series will lie everywhere above or
below the observed points; in mortality experiences, therefore, where the
2nd differences are usually not negligible while the 4th differences
change in sign frequently, it was pointed out by Jenkins that the 4th or
5th difference formulae are suitable but that the 3rd difference
expression is unsatisfactory.

This formula is appropriate (subject to the nature of the differences
just mentioned), and has been used frequently (for example, above age 32
in the U.S. Life Tables and Actuarial Tables, 1939-41, p. 125) when the
given values (or the “pivotal” values; see pars. 129-24) are not
sufficiently smooth to serve without some further adjustment as the basis
of reproducing osculatory interpolation.
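
The behaviour described in this paragraph can be seen in a short Python sketch of formula (47), written in the Everett form in which Jenkins' non-reproducing formula is usually quoted (coefficients $x(x^2-1)/6$ and $-x^3/36$). The function names and the list `u` of equally spaced values are assumptions made for illustration only.

```python
from math import comb


def central_difference(u, i, order):
    """Even-order central difference of the list u at index i."""
    h = order // 2
    return sum((-1) ** k * comb(order, k) * u[i + h - k] for k in range(order + 1))


def jenkins_47(u, i, x):
    """Graduated/interpolated value at i + x (0 <= x <= 1); needs u[i-2] .. u[i+3]."""
    y = 1.0 - x

    def half(t, j):
        # Contribution of the given value u[j] and its 2nd and 4th central differences
        d2 = central_difference(u, j, 2)
        d4 = central_difference(u, j, 4)
        return t * u[j] + t * (t * t - 1) / 6 * d2 - t ** 3 / 36 * d4

    return half(x, i + 1) + half(y, i)
```

At `x = 0` the function returns `u[i]` less one thirty-sixth of the 4th central difference, which is the adjustment of the given values mentioned above; the same value is obtained from the preceding interval at `x = 1`, so the arcs join without a break.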

(vii) The General Problem of Determining Tangential and Osculatory
Formulae of the Reproducing and Non-reproducing Types

108. The various well-known reproducing and non-reproducing formu¬ 
lae of Sprague (with Reilly’s generalizations), Buchanan, Karup-King, 
Shovelton, Henderson, and Jenkins which have been stated in pars. 100- 
107 are, in reality, particular expressions resulting from the selection of 
certain special methods of giving effect to the basic conditions on which 
they were derived. The first paper which directed attention to one aspect 
of this fact was that by A. R. Reid and J. B. Dow in 1933 (T.F.A., XIV, 
185); in 1935 a much wider discussion was given by J. E. Kerrich (J.I.A., 
LXVI, 88); and in 1944 T. N. E. Greville (T.A.S.A., XLV, 202) developed 





the general theory systematically with the object of “presenting the
subject as an integrated whole rather than a collection of isolated formulas.”*

109. In their paper Reid and Dow pointed out that Jenkins’ 5th difference
non-reproducing formula (47) is only one of many satisfying the
prescribed conditions, since one arbitrary constant is avoided by the
restriction implicit in Jenkins’ derivation that no differences beyond the
5th are to appear in the final formula. Relaxing this limitation, therefore,
Reid and Dow's general fifth difference osculatory non-reproducing formula
emerged as


«x = .v«, + + (jje - + J bx» 

+ yuu + ^ • fi*«o + (fty — B 'ito +1 5'mo 


(48) 


where $b$ is an arbitrary constant (giving Jenkins' (47) when $b = 0$).
Writing $u'_x$ for Jenkins' (47), the relation between it and (48) is


Reid and Dow consequently suggested that in practice $u'_x$ should first be
calculated, and that the flexibility afforded by the $b$ term should be used
to improve the results as may be desirable; the procedures which they
actually adopted are noted in par. 123 hereafter since they are of a type
belonging to the methods there discussed.

110. Kerrich’s treatment gave a more general approach under both
Sprague’s and Jenkins’ conditions, and produced expressions from which
the formulae of Reid and Dow, as well as those of Sprague, Buchanan,
Karup-King, Jenkins, and others are obtainable immediately as special
cases.†


* Although the mathematical techniques employed lie beyond the scope of this
Study, research workers should note also that I. J. Schoenberg has given (in 1946-48) an
elegant treatment of graduation and interpolation expressible in linear compound
form (see footnote to par. 12d here) in his papers “Contributions to the Problem of
Approximation of Equidistant Data by Analytic Functions” (Quarterly of Applied
Mathematics, IV, 45 and 112) and “Some Analytical Aspects of the Problem of
Smoothing” (Courant Anniversary Volume, 1948, 351). Using characteristic functions and
Fourier integrals, he derives the basic functions of the well-known interpolation
formulae in remarkably compact form, of which a brief summary is stated by T. N. E.
Greville in “Recent Developments in Graduation and Interpolation” (J.A.S.A., XLIII,
434). In 1952 Schoenberg extended his approach still further in an address “On
Smoothing Operations and Their Generating Functions” (Bulletin of the American
Mathematical Society, LIX, 199).

† Kerrich also pointed out the similarity between osculatory interpolation and the
method of “pseudo-analytical” graduation which (following the idea of Felix Klein)
has been presented by H. C. Nybølle (Nordic Statistical Journal, I, 103, and J.I.A.,




111. Greville’s paper dealt with the theory of tangential or osculatory
interpolation with equal intervals by developing the most general
expression for any order of contact, and showed how to derive a tangential
or osculatory formula satisfying predetermined requirements as to
continuity of derivatives, reproduction or non-reproduction of the given
values, number of terms, correctness to a stated order of differences, and
the values of any arbitrary parameters not fixed by those conditions. (For
example, the Karup-King formula (41) herein is a tangential, reproducing,
four-point formula, correct to second differences, and using third degree
curves; it can be derived by specifying these properties, and is the only
formula possessing them.) An unlimited number of such formulae can
thus be written down, amongst which are all the particular formulae
previously known (pars. 100-107 here), as well as a number of other useful
expressions which had not been given before (and some of no practical
value on account of the very general nature of the conditions imposed).
The freedom of choice is thus so wide that, as Greville showed with
numerical examples,* a formula can in fact be determined which will
incorporate in a single operation the desired amount of non-reproduction
with the type of interpolation selected.

Among the various formulae which he j)ublished, Greville directed at- 

LXVI), and further examined by J. F. Steffensen (Aktuárské Vědy, of which
an English abstract prepared by Kerrich appears in J.I.A., LXVI, 125). The objectives
of “pseudo-analytical” graduation are thus described in Kerrich’s paper (loc. cit., 92):
“The result aimed at is to pass a smooth curve among the observations in such a way
that any ordinate can readily be calculated, and such that the curve possesses a
pre-assigned number of continuous successive derivatives whose values can also be readily
calculated at all points on the curve. This curve and its derived curves are then regarded
as defining the underlying function sought and its successive derivatives.” Accordingly,
a set of approximate derivatives is first calculated from the observed values by some
practicable method such as approximate formulae for derivatives in terms of
observations, with subsequent graphical smoothing; if the graduating curve is to have, for
example, continuous first and second derivatives (because to go beyond second
derivatives would usually involve very heavy numerical work), the calculated second
derivatives are then assumed to form a polygonal arc of straight line segments; “these linear
segments are integrated interval by interval, and the constants of integration adjusted
so that each integrated element joins the next”; and finally these parabolic elements are
integrated similarly, so that the resulting graduating curve is built of a set of osculating
cubic arcs.

* fn an atijilieation to somewhat intractable material, he first used the operator 
1 which (see If. H. Wolfenden, T.A.S.A., Xl-Vf. 97, and XXVI, UX)-- 

II).?, and “The Fundamental Principles of Mathematical Statistics,” l.?2 57 ami 2S2 St) 
is I)c Forest’s “best /io” 7-term “lilting” formula »o ■= «M7//o 1- b// Li + f ;2 - lu+A 

which minimizes the mean square error in w, and then tried two specially detcrmineil 
“best /io” operators employing 2nd and 5th difference coriections. 




tention to 15 new tangential formulae (with first order contact) and 10 
new osculatory formulae (with 2nd order contact), and also to 7 (obtained 
by giving particular values to certain arbitrary parameters in the first 25) 
which were specially designed to incorporate the best or nearly best $R_0$
fitting. The results were presented in condensed tables, from which the 
Everett forms can be written down at once. As examples—but without 
implying their invariable superiority by their statement here, where they 
are shown as illustrations only—one of the new tangential formulae 
(Greville’s (69), Table I) is 

= Mi + (4* - 7) 

- Hy* - i) - *3^ (4y - 7) 5»iio 

and one of his osculatory non-reproducing expressions (his (88), Table 11) 
is 

«. = Ml + la (Jt* -i)Sui + ^ (x* - i) (x* - |) 

+ ain** (** ~ W* + V) *‘“»1 

- li (y* - i) «Mo + ,^1 (J-* - J) -1) «>Mo 

+ (y* - Wy + ¥) 

while among the best or nearly best Ro expressions the formula actually 
used in his numerical illustration (No. 109, Table IVB) is 

Ux — XUi + (x + §) + J**5 *Mi 

— ywo — ly (y + |) Puo — Jy®6^Mo 

(viii) Methods of Calculation, and Treatment of the End Terms, with
Tangential and Osculatory Interpolation Formulae; and the Unequal
Interval Case

112. As noted in the preceding paragraphs (see also H. Freeman’s
“Mathematics for Actuarial Students,” II, 73, and Greville’s comments
in R.A.I.A., XXXII, 86), the numerical work involved in calculating the
differences and using the Everett types of these interpolation formulae is
generally quite convenient. An alternative method which now is usually
preferable, however, because it eliminates the calculation of the
differences (except for the end values) and can be applied rapidly with modern
calculating machines, is to express and employ the formulae in terms of
the given $u$’s in linear compound form (see par. 123), by means of the
relations stated in par. 100(6); see, for examples, John Boyer’s paper
“Osculatory Interpolation in Practice” (R.A.I.A., XXXI, 337) and
M. D. Miller’s monograph “Elements of Graduation,” pp. 25-26.




113. The values at the two ends which are not reached by a symmetrical
interpolation formula can be handled by one of the following
methods: (1) Employing comparable unsymmetrical formulae to find the
interpolated values at the ends; (2) extending the interpolated values by
assuming that the missing values of the last order of differences are
constant (and that the next therefore vanish), completing the difference
table by addition, and applying the symmetrical formula; (3) making a
hypothetical extension of the series of given values, from which the
remaining interpolations are performed by using the symmetrical formula;
(4) continuing the interpolated values, as computed in the last interval
dealt with by the usual application of the symmetrical formula, by
assuming constant differences; (5) using special interpolation curves determined
to satisfy specified conditions and joining smoothly with the interpolated
values resulting from the symmetrical formula.

Method (1) is illustrated by formulae such as those given by Jenkins
for No. (47) here in respect of the two intervals at each end (see T.A.S.A.,
XXVIII, 209, and Buchanan, T.F.A., XII, 135 and 140). Method (2) is
discussed by G. J. Lidstone in T.F.A., XII, 277 (“Note on the Computation
of Terminal Values in Graduation by Jenkins’ Modified Osculatory
Formula”), where it is pointed out that the formulae of method (1) are
tacitly based on certain artificial values of the missing differences, and
that the same result is obtained by inserting 0 in the places of the missing
values of the relevant difference preceding its first value yielded by the
data, and similarly at the other end. Buchanan, however, objected in
T.F.A., XIV, 209, that this method in effect assigns definite magnitudes to
the missing $u$’s, and that the resulting curve consequently may be
distorted; he therefore advocated the use of reasonable hypothetical values
as in method (3). Method (4) was suggested and used by Greville in
T.A.S.A., XLV, 239. The special processes of method (5) are discussed in
par. 128 here.

In addition to the comments on particular methods just noted, refer¬ 
ence may be made to the papers by Boyer in R.A.I.A., XXXI, 337, and 
Greville in T.A.S.A., XLV, 237. 
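
The assumption of constant differences that underlies methods (2) and (4) can be illustrated by a small Python sketch. The function name and its arguments are hypothetical, and the sketch shows only the mechanical step of continuing a series by addition from the last line of its difference table; which order of differences to hold constant remains a matter of judgment as discussed above.

```python
def extend_by_constant_differences(values, order, count):
    """Append `count` further terms, holding the differences of `order` constant.

    Requires len(values) > order so that one difference of that order exists.
    """
    # Build the difference table and keep the last entry of each order
    table = [list(values)]
    for _ in range(order):
        previous = table[-1]
        table.append([b - a for a, b in zip(previous, previous[1:])])
    tails = [column[-1] for column in table]

    extended = list(values)
    for _ in range(count):
        # Complete the table by addition, from the highest order downwards
        for j in range(order - 1, -1, -1):
            tails[j] += tails[j + 1]
        extended.append(tails[0])
    return extended
```

For example, a series of cubes continued with `order=3` yields the next cubes exactly, since third differences of a cubic are constant.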

114. In practice the data from which the interpolations are to be made 
can usually be obtained at equal intervals. It may be noted, however, that 
the unequal interval case for reproducing formulae on Sprague’s prin¬ 
ciples (par. 99-101 here) has been examined by T. G. Ackland in J.I.A., 
XLIX, 369, and by J. F. Reilly (in its general form) in R.A.I.A., XV, 34. 

(ix) Reproducing and Non-reproducing Interpolation Minimizing the Mean 
Square Error in a Specified Order of Differences 

115. Although the tangential and osculatory formulae of the preceding 
paragraphs are based on the principles of interpolation, they deal, in fact, 




with the problem of graduation by constructing a continuous curve with
one or more of its derivatives satisfying specified conditions, and in such
a way that any interpolated value and the values of the derivatives are
all viewed as important.* It may be argued, however, that actually the
values of mortality table functions are seldom required for other than
integral ages, and that consequently it would be more practicable, and no
less logical, to regard such a function as a discrete series of numbers
rather than to think of it in terms of a continuous curve.

* This viewpoint is regarded even more emphatically in the somewhat analogous
“pseudo-analytical” method (see footnote to par. 110), for there the avowed objective
is to produce first from the data a smooth series of derivatives and thence to return to
the original function by successive integrations (cf. J.I.A., LXVI, 68 et seq.).

116. In order to approach the problem in this manner, the theory of
minimizing the mean square error of a linear compound, as it was first
stated in actuarial literature by E. L. De Forest, can be applied readily.
De Forest’s basic assumption (see H. H. Wolfenden, “On the Development
of Formulae for Graduation by Linear Compounding, with Special
Reference to the Work of Erastus L. De Forest,” T.A.S.A., XXVI, 81,
and “The Fundamental Principles of Mathematical Statistics,” pp. 132-34
and 283-84) was that, if $u_x$ is an observed value and $U_x$ the true
value so that $u_x = U_x + e_x$ where $e_x$ is the error, it may be
assumed that the $e$’s are independent and that the mean square of each
is (say) $\sigma^2$; or, to use other language, it may be assumed that the
errors in the observed values are independent random variables with mean
zero and variance $\sigma^2$. If it is further supposed that differences of
$U$ beyond order $j$ are zero, it follows that
$\Delta^nu_x = \Delta^ne_x$ for $n > j$. In his derivations of linear
compound graduation formulae of the type of No. (56) here, De Forest
adopted the logical position that theoretically the smoothest results would
be produced by minimizing the mean square error in $\Delta^{j+1}v$ (rather
than in $\Delta^jv$ as employed by many subsequent writers); this use of
$\Delta^{j+1}$ instead of $\Delta^j$ has been supported also by Wolfenden
in his paper on De Forest’s work (op. cit., pp. 109-10; and R.A.I.A.,
XXXVII, 33) and later by T. N. E. Greville (in R.A.I.A., XXXIV, 39, and
XXXVI, 259). The mathematical technique for thus minimizing the mean
square error in $v$ (to give the best fitting formulae, such as No. (61) of
par. 123 here), or in $\Delta^{j+1}v$ (for maximum smoothness), is
explained fully in the first references given in this paragraph, and is
also shown by Greville in R.A.I.A., XXXIV, 22-23 and 30. In those
explanations, and in the further developments to be now discussed, it is
important to recall that (a) the assumption that differences of $U$ beyond
order $j$ are zero means that the true $\Delta^{j+1}U_x$ is zero but the
observed $\Delta^{j+1}u_x$ is not zero; (b) as shown in “The Fundamental
Principles of Mathematical Statistics,” pp. 23-25 (formula (27) with
$\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_n^2 = \sigma^2$), the mean
square error of a linear compound $l_1F_1 + l_2F_2 + \ldots + l_nF_n$ of
any number of independent observed quantities $F_1, F_2, \ldots, F_n$ each
obeying the Normal Curve of Error with mean square errors $\sigma^2$ is
$(l_1^2 + l_2^2 + \ldots + l_n^2)\sigma^2$; (c) when a graduated value $v$
is determined by linear compounding from a range of observed $u$’s as in
(56) here, the mean square error in $v$ itself is therefore
$(\Sigma l^2)\sigma^2$; (d) similarly, when $v$ again is to be found as a
linear compound of $u$’s as in (56), the mean square error in $\Delta^5v$
when $j = 4$ (as shown in T.A.S.A., XXVI, 110, for the case of
$\Delta^4v$ when $j = 3$) is $\Sigma(\Delta^5l)^2\sigma^2$; (e) it is thus
apparent that when, on De Forest’s assumptions, we minimize the mean
square error in the $(j+1)$th differences we are really (to give two
alternative statements) minimizing the sum of the squares of the $(j+1)$th
differences, or (as stated by Greville in T.S.A., I, 356) we are minimizing
the sum of the squares of the coefficients which express the $(j+1)$th
differences of the interpolated values in terms of the given values
themselves.
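
Statements (b) to (d) lend themselves to a brief numerical illustration. The Python sketch below, in which the function names are hypothetical and the three-term average is chosen purely as an example (it is not one of the formulae of the text), computes the factor multiplying $\sigma^2$ in the mean square error of a linear compound and of its $n$th difference.

```python
def sum_of_squares(coefficients):
    """Factor multiplying sigma^2 in the mean square error of a linear compound."""
    return sum(c * c for c in coefficients)


def difference_coefficients(l, n):
    """Coefficients of the observed u's in the n-th difference of v = sum(l_k * u_{x+k}).

    Each pass applies one forward difference, so the list grows by one term.
    """
    c = list(l)
    for _ in range(n):
        # coefficient of u_{x+j} in (v_{x+1} - v_x) is l_{j-1} - l_j
        c = [a - b for a, b in zip([0.0] + c, c + [0.0])]
    return c


# Example only: a simple three-term average as the linear compound
average = [1 / 3, 1 / 3, 1 / 3]
error_in_v = sum_of_squares(average)                                    # times sigma^2
error_in_first_diff = sum_of_squares(difference_coefficients(average, 1))  # times sigma^2
```

Minimizing `sum_of_squares(difference_coefficients(l, j + 1))` over the admissible coefficients `l` is then the smoothness criterion described in (e).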

(i) These assumptions and procedures, as they have been described in
the publications to which reference has been made, are concerned basically
with graduation by linear compounding of the observed $u$’s. The same
technique obviously can be employed in the allied process of interpolating
at certain points from a set of observed $u$’s, the problem usually
encountered in dealing with population statistics being the interpolation of
the intermediate values from data at quinquennial points. In 1945,
therefore, Greville (in R.A.I.A., XXXIV, 22-33) adopted De Forest’s
assumptions, and the principles stated in the preceding paragraph, in order to
find the linear compounding coefficients for this problem of subdivision of
given intervals, and gave the results for reproducing interpolations when
the mean square error in the $(j+1)$th differences is to be minimized with
$j$ = 1, 2, 3, and 4 (his Tables 1, 3, 4, and 6). The detailed proof for the
case of $j = 4$ (Greville’s Table 6, and Table I here) is shown in R.A.I.A.,
XXXIV, 27-33; and a shorter form of proof, which depends on the
mathematical analogy between graduation and interpolation formulae
pointed out by H. Vaughan in J.I.A., LXXII, 482, is given by Greville for
the case of $j = 3$ (Table 4, loc. cit., p. 26) in his paper “On the Derivation
of Discrete Interpolation Formulas,” in T.S.A., I, 343. For the usual case
of subdividing given intervals into fifths (thus using six terms), and taking
$j = 4$ so that the mean square error in the 5th differences is minimized,
the coefficients as deduced by Greville in R.A.I.A., XXXIV, 27-33 (and
Table 6 on p. 28), are shown here in Table I (where the printed value
-.1266 for $v_{1.4}$ when $x = 4$ in R.A.I.A., XXXIV, 28, is corrected, as



                                  TABLE I

             Part A: To Be Used for the First Two Intervals

                      Coefficients of u_x To Obtain:

  x     v_.2     v_.4     v_.6     v_.8     v_1.2    v_1.4    v_1.6    v_1.8

  0   +.6763   +.4177   +.2221   +.0851   -.0420   -.0514   -.0400   -.0195
  1   +.4489   +.7819   +.9839  +1.0529   +.8484   +.6314   +.3904   +.1679
  2   -.0466   -.1286   -.1726   -.1346   +.2184   +.4844   +.7424   +.9314
  3   -.1966   -.2026   -.1266   -.0446   -.0056   -.0476   -.0896   -.0866
  4   +.1559   +.1749   +.1249   +.0559   -.0276   -.0266   -.0096   +.0049
  5   -.0379   -.0433   -.0317   -.0147   +.0084   +.0098   +.0064   +.0019


       Part B: To Be Used Except for the First Two and Last Two Intervals

                      Coefficients of u_x To Obtain:

   x      v_n+.2    v_n+.4    v_n+.6    v_n+.8

  n-2    +.0117    +.0136    +.0088    +.0027
  n-1    -.0921    -.1096    -.0776    -.0311
  n      +.9234    +.7184    +.4464    +.1854
  n+1    +.1854    +.4464    +.7184    +.9234
  n+2    -.0311    -.0776    -.1096    -.0921
  n+3    +.0027    +.0088    +.0136    +.0117


             Part C: To Be Used for the Last Two Intervals

                      Coefficients of u_x To Obtain:

   x    v_z-1.8  v_z-1.6  v_z-1.4  v_z-1.2   v_z-.8   v_z-.6   v_z-.4   v_z-.2

  z-5   +.0019   +.0064   +.0098   +.0084   -.0147   -.0317   -.0433   -.0379
  z-4   +.0049   -.0096   -.0266   -.0276   +.0559   +.1249   +.1749   +.1559
  z-3   -.0866   -.0896   -.0476   -.0056   -.0446   -.1266   -.2026   -.1966
  z-2   +.9314   +.7424   +.4844   +.2184   -.1346   -.1726   -.1286   -.0466
  z-1   +.1679   +.3904   +.6314   +.8484  +1.0529   +.9839   +.7819   +.4489
  z     -.0195   -.0400   -.0514   -.0420   +.0851   +.2221   +.4177   +.6763


pointed out by Greville, to -.0266). In that table the different
coefficients for the first two and last two intervals arise, of course, from the
inability of any symmetrical formula to reach the ends.
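
As an indication of how the Part B coefficients of Table I are applied in linear compound form, the following Python sketch subdivides one middle interval into fifths. The function name and the list `u` of given (for example quinquennial) values are assumptions for illustration; the coefficients are those of Part B as tabulated above.

```python
# Table I, Part B: rows are the given values u_{n-2} .. u_{n+3};
# columns are v_{n+.2}, v_{n+.4}, v_{n+.6}, v_{n+.8}.
PART_B = [
    [+.0117, +.0136, +.0088, +.0027],
    [-.0921, -.1096, -.0776, -.0311],
    [+.9234, +.7184, +.4464, +.1854],
    [+.1854, +.4464, +.7184, +.9234],
    [-.0311, -.0776, -.1096, -.0921],
    [+.0027, +.0088, +.0136, +.0117],
]


def subdivide_interval(u, n):
    """Return [v_{n+.2}, v_{n+.4}, v_{n+.6}, v_{n+.8}] for a middle interval.

    Requires 2 <= n <= len(u) - 4 so that u[n-2] .. u[n+3] all exist.
    """
    window = u[n - 2:n + 4]
    return [sum(row[m] * value for row, value in zip(PART_B, window))
            for m in range(4)]
```

Each column of coefficients sums to unity, and the tabulated values reproduce any polynomial of the fourth degree, which is consistent with the assumption $j = 4$ on which the table rests.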

The corresponding coefficients for ascertaining the single-age values
from five-year group totals (a problem which is encountered frequently in
handling population statistics) have been supplied by Greville for
publication in this Study, and are shown in Table II.

                                  TABLE II

        (Notation: W_x = w_x + w_{x+1} + w_{x+2} + w_{x+3} + w_{x+4})

             Part A: To Be Used for the First Two Intervals

                      Coefficients of W_x To Obtain:

  x      w_0     w_1     w_2     w_3     w_4     w_5     w_6     w_7     w_8     w_9

  0   +.3237  +.2586  +.1956  +.1370  +.0851  +.0420  +.0094  -.0114  -.0205  -.0195
  5   -.1252  -.0744  -.0064  +.0680  +.1380  +.1936  +.2264  +.2296  +.2020  +.1484
 10   -.0786  +.0076  +.0376  +.0300  +.0034  -.0248  -.0396  -.0284  +.0130  +.0798
 15   +.1180  +.0136  -.0384  -.0520  -.0412  -.0192  +.0024  +.0136  +.0100  -.0068
 20   -.0379  -.0054  +.0116  +.0170  +.0147  +.0084  +.0014  -.0034  -.0045  -.0019


       Part B: To Be Used Except for the First Two and Last Two Intervals

                      Coefficients of W_x To Obtain:

    x       w_5n    w_5n+1   w_5n+2   w_5n+3   w_5n+4

  5n-10   -.0117   -.0019   +.0048   +.0061   +.0027
  5n-5    +.0804   +.0156   -.0272   -.0404   -.0284
  5n      +.1570   +.2206   +.2448   +.2206   +.1570
  5n+5    -.0284   -.0404   -.0272   +.0156   +.0804
  5n+10   +.0027   +.0061   +.0048   -.0019   -.0117


             Part C: To Be Used for the Last Two Intervals

                      Coefficients of W_x To Obtain:

   x    w_z-10   w_z-9   w_z-8   w_z-7   w_z-6   w_z-5   w_z-4   w_z-3   w_z-2   w_z-1

  z-25  -.0019  -.0045  -.0034  +.0014  +.0084  +.0147  +.0170  +.0116  -.0054  -.0379
  z-20  -.0068  +.0100  +.0136  +.0024  -.0192  -.0412  -.0520  -.0384  +.0136  +.1180
  z-15  +.0798  +.0130  -.0284  -.0396  -.0248  +.0034  +.0300  +.0376  +.0076  -.0786
  z-10  +.1484  +.2020  +.2296  +.2264  +.1936  +.1380  +.0680  -.0064  -.0744  -.1252
  z-5   -.0195  -.0205  -.0114  +.0094  +.0420  +.0851  +.1370  +.1956  +.2586  +.3237

(ii) The preceding method is a process of reproducing interpolation.
If, however, reproduction of the given values is not required, corresponding
sets of coefficients for non-reproducing interpolation can be obtained
on De Forest's assumptions and the technique already discussed. Thus in
R.A.I.A., XXXIV, 24-29 (Tables 2, 5, and 7), Greville has stated the
coefficients (except at the ends) for non-reproducing interpolations
minimizing the mean square error in the (j + 1)th differences for j = 1 (with
3 terms) and for j = 3 (with 5 and 6 terms), and has suggested with respect
to the end intervals that the best practical solution will be to extend
the given values in a reasonable manner at both ends so that the coefficients
deduced for the middle intervals can be used throughout.












117. The methods and linear compounding coefficients which have
been described in par. 116 are all based on De Forest's assumption (which
also has been used extensively by many later authorities; see T.A.S.A.,
XXVI, 81 et seq.) that if $u_x$ is an observed value of $U_x$, so that
$u_x = U_x + e_x$, the $e$'s are independent random variables with zero mean
and variance $\varepsilon^2$; and the interpolations are made by minimizing the
mean square error in $\Delta^{j+1}v_x$, i.e., in the (j + 1)th differences of the
interpolated values, in conjunction with the further supposition that the
values of the true $\Delta^{j+1}U_x$ are actually zero. The development of this
interpolation method by Greville in 1945, as already stated, is thus in
reality an application of De Forest's graduation procedures to the allied
problem of interpolation, and it is so explained in par. 116 because it is
important for this relationship to be understood. This importance is
emphasized, moreover, by the fact that in 1944 (previous to Greville's work)
H. S. Beers had proceeded from a somewhat different assumption by a different
type of derivation, in which he did not make direct use of the theory of
minimizing the mean square error although that theory would have produced
identically the same results as he obtained (see R.A.I.A., XXXIV, 57).

The different assumption from which Beers started was, in effect (as
just noted), that the 5th differences of the observed values are independent
random variables with mean zero and variance (say) $\sigma^2$, the symbol
$\sigma^2$ for the assumed constant mean square error (variance) of each
$\Delta^5 u_x$ in this assumption being adopted in this Study for the purpose of
distinguishing it clearly from the assumed constant mean square error
(variance) $\varepsilon^2$ of each $u_x$ in De Forest's basis which was followed by
Greville and in par. 116 here. Since under both assumptions exactly the same
type of procedure is available in order to find the linear compounding
coefficients for minimizing the mean square error in the (j + 1)th differences,
it will be seen that on Beers' assumptions, as in statement (c) for De Forest's
assumptions in par. 116, it can be said alternatively that we are really
minimizing the sum of the squares of the (j + 1)th differences, or (as
explained by Greville in T.S.A., I, 354) we are minimizing the sum of the
squares of the coefficients which express the (j + 1)th differences of the
interpolated values in terms of the (j + 1)th differences of the given
values.

(i) The first coefficients which were published on this assumption that
the (j + 1)th differences of the observed values are independent random
variables with mean zero and variance $\sigma^2$ concerned a reproducing
interpolation with j = 4 (see H. S. Beers, "Six-Term Formulas for Routine
Actuarial Interpolation," R.A.I.A., XXXIII, 245). Those coefficients
were derived by a method which included in its conditions a requirement
that the sum of the interpolated values in any interval should equal the
sum of the corresponding values which would be obtained by ordinary
5th difference interpolation. Greville, however, pointed out (in R.A.I.A.,
XXXIV, 37-38) that it is preferable not to impose this condition, because
it is satisfied automatically, and is therefore redundant, except in the two
end intervals, and because in the end intervals it has the effect of giving
undue weight to the given values at some distance from the interval. The
improved coefficients for the end intervals which result from abandoning
the condition were therefore included in the set shown in Table III here as

TABLE III

Part A: To Be Used for the First Two Intervals

             Coefficients of u_x to Obtain:

  x      v_{.2}     v_{.4}     v_{.6}     v_{.8}     v_{1.2}    v_{1.4}    v_{1.6}    v_{1.8}

  0      +.6667     +.4072     +.2148     +.0819     -.0404     -.0497     -.0389     -.0191
  1      +.4969     +.8344     +1.0204    +1.0689    +.8404     +.6229     +.3849     +.1659
  2      -.1426     -.2336     -.2456     -.1666     +.2344     +.5014     +.7534     +.9354
  3      -.1006     -.0976     -.0536     -.0126     -.0216     -.0646     -.1006     -.0906
  4      +.1079     +.1224     +.0884     +.0399     -.0196     -.0181     -.0041     +.0069
  5      -.0283     -.0328     -.0244     -.0115     +.0068     +.0081     +.0053     +.0015

Part B: To Be Used Except for the First Two and Last Two Intervals

             Coefficients of u_x to Obtain:

  x        v_{n+.2}   v_{n+.4}   v_{n+.6}   v_{n+.8}

  n-2      +.0117     +.0137     +.0087     +.0027
  n-1      -.0921     -.1101     -.0771     -.0311
  n        +.9234     +.7194     +.4454     +.1854
  n+1      +.1854     +.4454     +.7194     +.9234
  n+2      -.0311     -.0771     -.1101     -.0921
  n+3      +.0027     +.0087     +.0137     +.0117

Part C: To Be Used for the Last Two Intervals

             Coefficients of u_x to Obtain:

  x      v_{z-1.8}  v_{z-1.6}  v_{z-1.4}  v_{z-1.2}  v_{z-.8}   v_{z-.6}   v_{z-.4}   v_{z-.2}

  z-5    +.0015     +.0053     +.0081     +.0068     -.0115     -.0244     -.0328     -.0283
  z-4    +.0069     -.0041     -.0181     -.0196     +.0399     +.0884     +.1224     +.1079
  z-3    -.0906     -.1006     -.0646     -.0216     -.0126     -.0536     -.0976     -.1006
  z-2    +.9354     +.7534     +.5014     +.2344     -.1666     -.2456     -.2336     -.1426
  z-1    +.1659     +.3849     +.6229     +.8404     +1.0689    +1.0204    +.8344     +.4969
  z      -.0191     -.0389     -.0497     -.0404     +.0819     +.2148     +.4072     +.6667













finally proposed by Beers in R.A.I.A., XXXIV, 59 (correcting those given
earlier in R.A.I.A., XXXIII, 258).

The corresponding coefficients for ascertaining the single-age values
from 5-year group totals were also stated by Beers in R.A.I.A., XXXIV,
60, and are shown in Table IV here.

TABLE IV

(Notation: W_x = w_x + w_{x+1} + w_{x+2} + w_{x+3} + w_{x+4})

Part A: To Be Used for the First Two Intervals

             Coefficients of W_x to Obtain:

  x      w_0      w_1      w_2      w_3      w_4      w_5      w_6      w_7      w_8      w_9

  0      +.3333   +.2595   +.1924   +.1329   +.0819   +.0404   +.0093   -.0108   -.0198   -.0191
  5      -.1636   -.0780   +.0064   +.0844   +.1508   +.2000   +.2268   +.2272   +.1992   +.1468
  10     -.0210   +.0130   +.0184   +.0054   -.0158   -.0344   -.0402   -.0248   +.0172   +.0822
  15     +.0796   +.0100   -.0256   -.0356   -.0284   -.0128   +.0028   +.0112   +.0072   -.0084
  20     -.0283   -.0045   +.0084   +.0129   +.0115   +.0068   +.0013   -.0028   -.0038   -.0015

Part B: To Be Used Except for the First Two and Last Two Intervals

             Coefficients of W_x to Obtain:

  x        w_{5n}    w_{5n+1}  w_{5n+2}  w_{5n+3}  w_{5n+4}

  5n-10    -.0117    -.0020    +.0050    +.0060    +.0027
  5n-5     +.0804    +.0160    -.0280    -.0400    -.0284
  5n       +.1570    +.2200    +.2460    +.2200    +.1570
  5n+5     -.0284    -.0400    -.0280    +.0160    +.0804
  5n+10    +.0027    +.0060    +.0050    -.0020    -.0117

Part C: To Be Used for the Last Two Intervals

             Coefficients of W_x to Obtain:

  x      w_{z-10} w_{z-9}  w_{z-8}  w_{z-7}  w_{z-6}  w_{z-5}  w_{z-4}  w_{z-3}  w_{z-2}  w_{z-1}

  z-25   -.0015   -.0038   -.0028   +.0013   +.0068   +.0115   +.0129   +.0084   -.0045   -.0283
  z-20   -.0084   +.0072   +.0112   +.0028   -.0128   -.0284   -.0356   -.0256   +.0100   +.0796
  z-15   +.0822   +.0172   -.0248   -.0402   -.0344   -.0158   +.0054   +.0184   +.0130   -.0210
  z-10   +.1468   +.1992   +.2272   +.2268   +.2000   +.1508   +.0844   +.0064   -.0780   -.1636
  z-5    -.0191   -.0198   -.0108   +.0093   +.0404   +.0819   +.1329   +.1924   +.2595   +.3333
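
The application of Part B of Table IV to a middle group may be sketched, purely as an illustration, in Python. The function name `subdivide_middle` and the sample group totals are assumptions made for this sketch; the coefficients are Beers' values as shown in Part B. Because the coefficients of W_{5n} add to unity and those of each other group add to zero, the five single-age values reproduce the total of the group which is being subdivided.

```python
# Table IV, Part B.  Rows: W_{5n-10}, W_{5n-5}, W_{5n}, W_{5n+5}, W_{5n+10};
# columns: w_{5n}, w_{5n+1}, w_{5n+2}, w_{5n+3}, w_{5n+4}.
PART_B = (
    (-0.0117, -0.0020, 0.0050, 0.0060, 0.0027),
    (0.0804, 0.0160, -0.0280, -0.0400, -0.0284),
    (0.1570, 0.2200, 0.2460, 0.2200, 0.1570),
    (-0.0284, -0.0400, -0.0280, 0.0160, 0.0804),
    (0.0027, 0.0060, 0.0050, -0.0020, -0.0117),
)


def subdivide_middle(groups):
    """Split the middle one of five consecutive 5-year totals into single ages.

    `groups` holds W_{5n-10}, W_{5n-5}, W_{5n}, W_{5n+5}, W_{5n+10}.
    Hypothetical helper written for this illustration.
    """
    if len(groups) != 5:
        raise ValueError("exactly five consecutive group totals are required")
    return [
        sum(PART_B[row][age] * groups[row] for row in range(5))
        for age in range(5)
    ]


if __name__ == "__main__":
    totals = [1000, 1100, 1250, 1300, 1200]   # illustrative data only
    single_ages = subdivide_middle(totals)
    print(single_ages, sum(single_ages))
```

For the first two and last two groups the unsymmetrical coefficients of Parts A and C would be substituted in the same way.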

(ii) When reproduction of the given values is not required, coefficients
for a non-reproducing interpolation minimizing the sum of the squares of
the (j + 1)th differences when j = 3, and again using six terms for
subdividing an interval into five parts, have also been suggested by Beers in
R.A.I.A., XXXIV, 19. The corresponding coefficients for ascertaining the
single-age values from 5-year group totals are given in R.A.I.A., XXXIV,
20.














117A. These interpolation methods of pars. 115-17, which have not
yet been used extensively, may be expected, under suitable conditions and
with properly chosen coefficients, to produce greater smoothness in
mortality tables constructed from population statistics than the tangential
and osculatory formulae of pars. 99-111. In practice, however, it will of
course always remain true, as De Forest, W. F. Sheppard, and others
have emphasized (cf. Wolfenden, T.A.S.A., XXVI, 107 [footnote], 110
[footnote], and 112, and Jenkins, R.A.I.A., XXXIV, 47-49 and 184),
that it is important to select a set of coefficients which will be capable of
dealing with the actual values of the various orders of differences of $u_x$, by
appropriate assumptions respecting the errors in $u$ or one of its orders of
differences, and the value to be chosen for $j$. It is equally important, in
order to gauge fairly the effects of different methods, that each procedure
should be tested by examining the graduated or interpolated values of that
order of differences which the method aims to adjust (even though other
orders of differences may be examined also). The selection of method,
moreover, should be made in any particular case without attaching undue
weight either to custom or to examples based on other data which may exhibit
different characteristics, for the problem of graduating or interpolating
population statistics, with their extensive and varied errors, is not in any
sense routine.

(x) "Interlocking" Interpolation Formulae

118. Another type of interpolation was suggested in 1948 by Aubrey
White in T.A.S.A., XLIX, 337, which represents a compromise between
the "continuous" viewpoint of the tangential and osculatory formulae
(see par. 115) on the one hand, and the "discrete" approach (loc. cit.) of
Beers' method on the other. White remarked that the strict finite difference
analogue (appropriate for the discrete approach) of equal derivatives
at the points of junction (as in tangential and osculatory interpolation)
would be to require the equality of a specified number of central
sub-differences and mean sub-differences of adjacent curve segments at their
points of junction. He thus proposed to make equal at the border point
"either the first 2n + 1 of 1, $\mu\delta'$, $\delta'^2$, etc., or the first 2n of $\mu'$, $\delta'$, etc.,
depending on whether an odd or an even number of conditions are to be
set," where the primed symbols indicate that the sub-interval is taken as
the unit in computing differences and mean values. It is thus seen, as he
explains, that "this is equivalent to equaling the points in the two curves
involved in these sub-differences"; the conditions consequently become
that "the two curves intersect at a number of points one greater than the
highest order of differences set equal, the points to center about the border
point and to be separated from each other by the given sub-interval";
and "the resulting curves will resemble strands of wire twisted together,
suggesting the term 'interlocking' for the curves, and, by extension, for
the points." An algebraic expression can be written for such an
"interlocking" formula, and it is possible to think of a continuous curve segment
in each interval as for the basic ideas of tangential or osculatory
interpolation; on the other hand, from the discrete point of view, the
sub-differences which enter into the derivation of the formula are actually
computed from certain discrete values of the function, and we may think
of successive groups of discrete points rather than successive curve
segments. As White points out, the well-known tangential and osculatory
formulae of both the reproducing and non-reproducing types (such as
those of Karup-King, Sprague, Jenkins, etc.) are obtainable as special
cases of the appropriate "interlocking" curve.

No numerical application of these formulae has yet been published;
their practical utility, therefore, remains to be determined.

(xi) Reproducing Subtabulation Minimizing the Sum of the Squares of All
Differences of Given Order by Difference-Equation Operations

119. It has been suggested by C. A. Spoerl (T.A.S.A., XXXVIII, 403;
see also XLIX, 300), H. Vaughan (J.I.A., LXXII, 401), and T. N. E.
Greville (Boletim do Instituto Brasileiro de Atuaria, II, 7), that
subtabulation (i.e., the subdivision of the intervals between given values into a
specified number of equal parts) might be regarded as an over-all operation,
in which each interpolated value will depend on all the given values
and not merely on a limited number such as four or six, and the
interpolated terms will have over the whole series the minimum sum of the
squares of differences of any given order. Vaughan and Greville
independently have described procedures, involving the solution of a
difference-equation, by which such an interpolation can be effected.* These
methods give, in the sense thus prescribed, the smoothest possible
interpolation. However, the process is more laborious than the use of formulae
limited to a finite range, and in many cases it is doubtful if there would be
a great difference in the numerical results in comparison, for example,
with a reproducing minimized difference interpolation of the type
described in pars. 114-16. It is unlikely, therefore, that these procedures
will ever be used extensively.

* It is interesting to note that White has shown (T.A.S.A., XLIX, 357) that certain
"interlocking" formulae, derived on altogether different assumptions, yield
interpolated values identical with those produced by this approach.



(xii) Pivotal Values 

120. In some of the experimental reconstructions of the English life
tables which were performed by George King, as recorded in par. 99 here,
it was necessary first to calculate certain preliminary quinquennial
values, by ordinary interpolation formulae for the bisection of decennial
groups, in order to provide a set of equally spaced (but ungraduated)
quinquennial values from which the intermediate terms could be found
by tangential or osculatory interpolation. King subsequently (in J.I.A.,
XLIII, 109) evolved a development of this procedure which has been used
on numerous occasions. In that development the data (usually the deaths
and populations separately) are arranged in 5-year age groups (which, in
accordance with the principles of pars. 53-54, may be assumed to be
correct in total); from these groups quinquennial adjusted (i.e.,
"graduated") values, called pivotal (or sometimes "guiding") values, are
calculated for the central year of age of each group; the resulting pivotal
values of $q_x$ are computed therefrom; and then the intermediate values
of $q_x$ at each age are filled in from these pivotal values by tangential or
osculatory interpolation.*

* Since the pivotal values computed by the several methods discussed in pars. 120-24
here are, as stated, graduated values, the student's understanding of the principles
involved may be assisted by the explanations of the nature and objects of graduation
which are given in "The Fundamental Principles of Mathematical Statistics," pp. 89-90
and 120.

King obtained his pivotal values by ordinary differences. If $u_x$ denotes
the data (deaths or populations usually) for the year of age $x$ to $x+1$, the
5-year age groups $\sum_{x=n}^{n+4} u_x = w_n$, say, are taken for quinquennial values of
$n$. In the simplest practical case of determining the central value $u_7$ from
the three groups $\sum_{0}^{4} u_x$, $\sum_{5}^{9} u_x$, and $\sum_{10}^{14} u_x$, which together constitute
the series $u_0 \ldots u_{14}$, we see that $u_7 = \sum_{0}^{7} u_x - \sum_{0}^{6} u_x = y_8 - y_7$, say,
where $y_n = \sum_{0}^{n-1} u_x$, so that (using $\Delta$ for a 5-year interval)

$$\Delta y_n = \sum_{0}^{n+4} u_x - \sum_{0}^{n-1} u_x = \sum_{n}^{n+4} u_x = w_n .$$

But

$$y_8 = y_0 + \tfrac{8}{5}\Delta y_0 + \tfrac{12}{25}\Delta^2 y_0 - \tfrac{8}{125}\Delta^3 y_0$$

and

$$y_7 = y_0 + \tfrac{7}{5}\Delta y_0 + \tfrac{7}{25}\Delta^2 y_0 - \tfrac{7}{125}\Delta^3 y_0 ,$$

whence $u_7$, being $y_8 - y_7$,

$$\begin{aligned}
u_7 &= .2\Delta y_0 + .2\Delta^2 y_0 - .008\Delta^3 y_0 \\
    &= .2\Delta y_5 - .008\Delta^3 y_0 \\
    &= .2w_5 - .008\Delta^2 w_0 \qquad\qquad (49) \\
    &= .216w_5 - .008(w_0 + w_{10}) . \qquad (50)
\end{aligned}$$

This formula can also be deduced easily by the method used by E. L.
De Forest and shown by H. H. Wolfenden in T.A.S.A., XXVI, 85-87.
It is correct to fourth differences of $y$ and therefore third differences of $u$.
After King had concluded that it gave satisfactory pivotal values, it was
used (in conjunction with the Karup-King third difference tangential
reproducing formula of par. 101 here) in his official construction of the
English Life Tables No. 7 for 1901-10 and No. 8 for 1910-12 (see the
Supplement to the 75th Annual Report of the Registrar-General for
England and Wales, and J.I.A., XLIX, 297), and subsequently by Sir
Alfred Watson in No. 9 for 1920-22 and No. 10 for 1930-32 (Registrar-
General's Decennial Supplements for 1921 and 1931, and J.I.A., LIX,
125), by L. A. Bullwinkle in the Northern Ireland Life Table for 1925-27
(Registrar-General's Review of Vital Statistics of Northern Ireland and
Life Tables, 1926), and by Sir Alfred Watson in the Scottish Life Table
for 1930-32 (Supplement to the 78th Annual Report of the Registrar-
General for Scotland, and T.F.A., XVI, 67). The formula has also been
adopted on many occasions in several other countries, e.g., in New
Zealand (see L. S. Polden's paper already mentioned in par. 61 here), in
Australia (see the paper by F. W. Barford noted in par. 127, and the
Australian Life Tables 1946-1948 prepared by W. C. Balmford,
Commonwealth Actuary), in Canada (see M. D. Grant's construction
described in T.A.S.A., XXXV, 8), and in the United States (as stated by
T. N. E. Greville in the U.S. Life Tables and Actuarial Tables, 1939-41,
p. 123). These repeated instances of the use of King's formula are
attributable in Britain largely to the advantages resulting from uniformity
of treatment in a long series of tables, and also in all cases to its simplicity,
ease of application, and often sufficiently satisfactory results. The
popularity which it has thus attained, however, should not be taken as
indicating either its theoretical or practical superiority over the more modern
pivotal formulae described in pars. 121-23 (see also Wolfenden's
comments in T.A.S.A., XXXV, 291-92).
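
King's formula in its two equivalent forms (49) and (50) may be sketched in Python as follows. The function names and the cubic test series are assumptions introduced only for this illustration; the coefficients .2, .008, and .216 are those of the formulae themselves. When the single-age data follow a cubic exactly, the pivotal value coincides with the true central value $u_7$.

```python
def group_totals(u, width=5):
    """Sum consecutive blocks of `width` single-age values (w_0, w_5, w_10, ...)."""
    return [sum(u[i:i + width]) for i in range(0, len(u) - width + 1, width)]


def king_pivotal(w_prev, w_mid, w_next):
    """Formula (50): .216 w_5 - .008 (w_0 + w_10)."""
    return 0.216 * w_mid - 0.008 * (w_prev + w_next)


def king_pivotal_by_differences(w_prev, w_mid, w_next):
    """Formula (49): .2 w_5 - .008 times the second difference of w_0."""
    second_difference = w_prev - 2 * w_mid + w_next
    return 0.2 * w_mid - 0.008 * second_difference


if __name__ == "__main__":
    # Illustrative cubic data for ages 0 to 14 (not taken from any real table).
    u = [100 + 3 * x - 0.5 * x ** 2 + 0.02 * x ** 3 for x in range(15)]
    w0, w5, w10 = group_totals(u)
    print(king_pivotal(w0, w5, w10), king_pivotal_by_differences(w0, w5, w10), u[7])
```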




121. Since these pivotal values are determined from the grouped data
by an elementary process of graduation, it may sometimes be advisable to
use a formula based upon more groups, and higher differences (where
necessary), in order to obtain a wider basis for the method. When
differences of the $u$'s up to order $j$ are retained, so that in effect
$u_x = A + Bx + Cx^2 + \ldots + Jx^j$, we can use $j + 1$ groups (with the corresponding
range) and still obtain a direct solution; and with 3, 5, . . . symmetrical
groups, as in (49) and (52) here, the formulae are the same for $j$ = 2 or 3,
4 or 5, . . . Thus, taking $u_x$ as a continuous function and the origin at the
centre of the range, the required value for the central year of age is

$$\int_{-1/2}^{1/2} u_x\,dx = \int_{-1/2}^{1/2} (A + Bx + Cx^2 + \ldots)\,dx = A + \frac{C}{12} + \frac{E}{80} + \ldots$$

The given numerical value, $w$, of any group comprising $t$ terms, in which
the abscissa of the middle ordinate is $a$, is similarly equivalent to

$$\int_{a - t/2}^{a + t/2} (A + Bx + Cx^2 + \ldots + Jx^j)\,dx ,$$

from which $A, B, \ldots$ are immediately expressible in terms of $w$; and
hence the above central value

$$A + \frac{C}{12} + \frac{E}{80} + \ldots ,$$

or any other non-central value or group of values, follows at once in
terms of $w$.* As examples, with 3 groups and $j$ = 2 or 3 we get King's
formula (49); with $j$ = 3 and four consecutive groups of the range
$u_0 \ldots u_{19}$ we find

$$\begin{aligned}
u_{9\frac{1}{2}} &= .1165\,(w_5 + w_{10}) - .0165\,(w_0 + w_{15}) \\
  &= \tfrac{1}{10}\left[(w_5 - .165\Delta^2 w_0) + (w_{10} - .165\Delta^2 w_5)\right] \qquad (51)
\end{aligned}$$

as used in Henderson's "Mortality Laws and Statistics," pp. 75 and 94;
while with five consecutive groups and $j$ = 4 or 5

$$\begin{aligned}
u_{12} &= .221376\,w_{10} - .011584\,(w_5 + w_{15}) + .000896\,(w_0 + w_{20}) \\
  &= .2w_{10} - .008\Delta^2 w_5 + .000896\Delta^4 w_0 , \qquad (52)
\end{aligned}$$

which is the pivotal formula to fifth differences of $u$ ($j$ = 5) used by King
in J.I.A., XLIII, 114, and there demonstrated in the same way as (49)
here.

* As shown by E. L. De Forest; see H. H. Wolfenden's paper "On the Development
of Formulae for Graduation by Linear Compounding, with Special Reference to
the Work of Erastus L. De Forest," T.A.S.A., XXVI, pp. 81-121, where the
derivations for several cases are given in detail.
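
The direct solution described in par. 121 can be reproduced, as a sketch under stated assumptions, by equating the integrals of $1, x, x^2, \ldots$ over the groups to the corresponding integrals over the central year of age and solving the resulting linear equations exactly. The Python function names below are illustrative assumptions; only the method (polynomial $u_x$ treated as continuous, origin at the centre of the range, groups of 5 terms) is taken from the text.

```python
from fractions import Fraction


def monomial_integral(power, lower, upper):
    """Integral of x**power between `lower` and `upper`."""
    return (upper ** (power + 1) - lower ** (power + 1)) / (power + 1)


def solve_exact(matrix, rhs):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    size = len(rhs)
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def pivotal_coefficients(centres, width=5):
    """Coefficients of the group totals giving the unit interval centred at 0.

    `centres` are the abscissae of the middle ordinates of the groups; with
    j + 1 groups the polynomial is taken to be of degree j.
    """
    half = Fraction(width, 2)
    centres = [Fraction(c) for c in centres]
    count = len(centres)
    matrix = [
        [monomial_integral(m, c - half, c + half) for c in centres]
        for m in range(count)
    ]
    rhs = [
        monomial_integral(m, Fraction(-1, 2), Fraction(1, 2))
        for m in range(count)
    ]
    return solve_exact(matrix, rhs)


if __name__ == "__main__":
    print([float(c) for c in pivotal_coefficients([-5, 0, 5])])          # King, (50)
    print([float(c) for c in pivotal_coefficients(
        [Fraction(-15, 2), Fraction(-5, 2), Fraction(5, 2), Fraction(15, 2)])])  # (51)
    print([float(c) for c in pivotal_coefficients([-10, -5, 0, 5, 10])])  # (52)
```

With three, four, and five groups the sketch returns the coefficients of (50), (51), and (52) respectively.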

122.* The pivotal values in the methods of pars. 120 and 121 are all
found by ordinary interpolations from groups. Since the real objective in
the calculation of those values, however, is to obtain reliable points which
represent the data adequately but yet remove any undue fluctuations,
and also because the subsequent tangential or osculatory interpolations
based thereon have a tendency to show undulations and points of inflexion
(cf. par. 107 here), it has been suggested (first by J. Buchanan, with
subsequent investigations by W. A. Jenkins) that the pivotal values might be
determined from the quinquennial sums by osculatory instead of ordinary
interpolation. The pivotal value formulae so derived are:

(i) From Sprague's 5th difference reproducing osculatory formula
of par. 100,

$$u_{12} = .2w_{10} - .008\Delta^2 w_5 + .0064\Delta^4 w_0 \qquad (53)$$

as deduced in Jenkins' paper in T.A.S.A., XXXI, 14.

(ii) From Jenkins' 4th difference osculatory non-reproducing formula
(described in par. 107 here),

$$u_{12} = .2w_{10} - .008\Delta^2 w_5 - .00019375\Delta^4 w_0 \qquad (54)$$

as also given by Jenkins in T.A.S.A., XXXI, 13.

(iii) From Jenkins' 5th difference osculatory non-reproducing formula
of par. 107,

$$u_{12} = .2w_{10} - .008\Delta^2 w_5 - .0042\Delta^4 w_0 \qquad (55)$$

as suggested by Buchanan in T.F.A., XII, 128-29.

The practical effects of using these formulae for the pivotal values, and
then completing the interpolations by the corresponding osculatory
formula from which the pivotal expression was derived, were tested in
Buchanan's and Jenkins' papers. With regard to smoothness (as would be
expected), Sprague's reproducing formula as a basis for both the pivotal
and subsequent calculations was improved slightly by the 4th difference
non-reproducing method, which in turn was improved by the
non-reproducing 5th difference process, while of course the order was reversed in
respect of fit.

* The wording of this paragraph is taken largely from "The Fundamental
Principles of Mathematical Statistics," p. 130.

123. The preceding formulae are all based upon direct interpolation,
without imposing any conditions upon the coefficients of the $w$'s; and they
represent in each case a unique solution. If it is desired, however, to
employ more groups than in the preceding cases without including more
orders of differences it would be necessary to restrict the coefficients in
some way in order to obtain a complete solution.* The pivotal values, say
$v_r$, in the preceding formulae are immediately expressible in the "linear
compound" form

$$v_r = l_r u_r + (l_{r+1}u_{r+1} + l_{r-1}u_{r-1}) + (l_{r+2}u_{r+2} + l_{r-2}u_{r-2}) + \ldots + (l_{r+n}u_{r+n} + l_{r-n}u_{r-n}) \qquad (56)$$

by putting the $u$'s for the $w$'s; and when differences above order $j$ are
neglected so that $u_x = A + Bx + Cx^2 + \ldots + Jx^j$, the general form of,
say, $v_0$ becomes

$$\begin{aligned}
v_0 ={} & A\,[\,l_0 + (l_1 + l_{-1}) + (l_2 + l_{-2}) + \ldots + (l_n + l_{-n})\,] \\
 & + B\,[\,(l_1 - l_{-1}) + 2(l_2 - l_{-2}) + \ldots + n(l_n - l_{-n})\,] \\
 & + C\,[\,1^2(l_1 + l_{-1}) + 2^2(l_2 + l_{-2}) + \ldots + n^2(l_n + l_{-n})\,] \\
 & + D\,[\,1^3(l_1 - l_{-1}) + 2^3(l_2 - l_{-2}) + \ldots + n^3(l_n - l_{-n})\,] \\
 & + \ldots \\
 ={} & \text{(say) } c_0A + c_1B + c_2C + c_3D + \ldots \qquad (57)
\end{aligned}$$

Consequently when $j = 0$, so that $u_x = A$, we must have $c_0 = 1$; when
$j = 1$ and $u_x = A + Bx$, it is necessary and sufficient that $c_0 = 1$ and
$c_1 = 0$; and generally when differences above order $j$ are negligible the
conditions are that $c_0 = 1$, $c_1 = 0$, $c_2 = 0$, . . . , $c_j = 0$. These conditions
cover both unsymmetrical and symmetrical cases; and for the latter,
since $l_n = l_{-n}$, they reduce to $c_0 = 1$, $c_2 = 0$, $c_4 = 0$, etc., and are the
same when $j$ = 0 or 1, 2 or 3, 4 or 5, . . . When, for example, an odd number
of quinquennial groups of values are used, as in (49) and (52), (56)
becomes, with the groups taken centrally,

$$\begin{aligned}
v_0 ={} & \ldots + r'(u_{-12} + \ldots + u_{-8}) + q'(u_{-7} + \ldots + u_{-3}) + p\,(u_{-2} + \ldots + u_2) \\
 & + q\,(u_3 + \ldots + u_7) + r\,(u_8 + \ldots + u_{12}) + \ldots \qquad (58)
\end{aligned}$$

and (57) and its resulting conditions are modified similarly.
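
The conditions on the $c$'s can be checked numerically. The following Python sketch (the helper names `king_coefficients` and `moment` are assumptions introduced for this illustration) expresses King's formula (50) in terms of the individual $u$'s, forms $c_k = \sum x^k l_x$, and confirms that $c_0 = 1$ while $c_1$, $c_2$, and $c_3$ vanish, as required when differences above the third order are neglected.

```python
from fractions import Fraction


def king_coefficients():
    """Coefficients l_x of King's formula (50), with the origin at the central age."""
    p, q = Fraction(216, 1000), Fraction(-8, 1000)
    coefficients = {x: p for x in range(-2, 3)}           # central group
    coefficients.update({x: q for x in range(3, 8)})      # group above
    coefficients.update({x: q for x in range(-7, -2)})    # group below
    return coefficients


def moment(coefficients, k):
    """c_k: the sum of x**k * l_x over all offsets x."""
    return sum(Fraction(x) ** k * l for x, l in coefficients.items())


if __name__ == "__main__":
    l = king_coefficients()
    print([moment(l, k) for k in range(5)])
```

The fourth moment $c_4$ is not zero, which reflects the fact that the formula does not allow for fourth differences of $u$.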

* The subsequent arrangement of this paragraph is taken from H. H. Wolfenden's
paper mentioned in the footnote to par. 121. The terms "linear compound" and "linear
compounding" were first used in actuarial literature by W. F. Sheppard, and were
subsequently adopted in the paper just noted (see also "The Fundamental Principles
of Mathematical Statistics," pp. 132 and 282-83, for a condensed statement of the
principles underlying "graduation by linear compounding").

In the usual case when $j = 3$ a complete solution is obtained directly
by taking three symmetrical quinquennial groups, so that $q = q'$ and
$r = r' = 0$ and the conditions $c_0 = 1$ and $c_2 = 0$ become $5p + 10q = 1$
and $5p + 135q = 0$, whence $p = .216$ and $q = -.008$ as in (50). With
five symmetrical groups ($r = r'$ and $q = q'$), however, the conditions
$c_0 = 1$ and $c_2 = 0$ can give a complete solution only when the three
unknowns $p$, $q$, and $r$ are reduced to two; and (as pointed out in J.I.A., LI,
368) if this is done by taking $p = q$, so that the conditions become
$10r + 15p = 1$ and $140p + 510r = 0$, we obtain (without, it is to be noted, any
introduction of the method of least squares) Henderson's formula
(originally given in the First Edition of Actuarial Study No. 4, p. 22)

$$v_0 = \tfrac{1}{625}\left[\,51\,(w_0 + w_{\pm 5}) - 14\,w_{\pm 10}\,\right] \qquad (59)$$

where

$$w_n = \sum_{x=n-2}^{n+2} u_x$$

and $w_{\pm n}$ is written for $w_n + w_{-n}$. If, however, $p \neq q$, there are three
unknowns and the two conditional equations

$$\begin{aligned}
5p + 10q + 10r &= 1 \\
5p + 135q + 510r &= 0
\end{aligned} \qquad (60)$$

If we now assume that each $u$ is affected by an error $e$ of which the mean
square is $\varepsilon^2$ we may determine the "best" formula (i.e., that which will
secure the greatest possible reduction of the mean square error in $v$, on the
assumptions made), by making the mean square error of $v_0$, or $\varepsilon^2\sum l^2$, that is,
$(5p^2 + 10q^2 + 10r^2)\,\varepsilon^2$, a minimum subject to (60). From these three
conditions, therefore, we find that

$$v_0 = \tfrac{1}{7}\left[\,.696\,w_0 + .488\,w_{\pm 5} - .136\,w_{\pm 10}\,\right] . \qquad (61)$$

This formula was first given anonymously in J.I.A., LI, 368. The main
objective in determining the pivotal values is that they must represent
the original data satisfactorily (and permit the eventual construction
therefrom of a smooth curve); consequently it may be held that they
should "fit" the data in accordance with the best available a priori
criterion of fit; and this formula accomplishes that objective by effecting
the greatest possible reduction in the mean square error of $u$, which on
the assumptions made gives the same result as a best "fitting" according
to the criterion of least squares (see "The Fundamental Principles of
Mathematical Statistics," pp. 283-84 particularly, and pp. 91-97, 106-8,
and 322-28 with respect to the method of least squares; and T.A.S.A.,
XXVI, 100-105, where on p. 105 $w_n = \sum u_x$). It has been used by
T. N. E. Greville in T.A.S.A., XLV, 227 (see also H. H. Wolfenden,
T.A.S.A., XLVI, 97), and in the U.S. Life Tables and Actuarial Tables,
1939-41, p. 124, for populations and deaths of Negroes between ages 32
and 72 and "other races" at ages 32 to 87, where its graduating power
and its underlying assumption of "fitting" a third degree curve by least
squares were considered to be appropriate and to be preferable to King's
formula (50).
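
The minimization leading to (61) may be verified by a short calculation. The Python sketch below is offered only as an illustration, and its function name is an assumption; it minimizes $5p^2 + 10q^2 + 10r^2$ subject to the two conditional equations (60) by the method of undetermined (Lagrange) multipliers, using exact fractions.

```python
from fractions import Fraction


def best_pivotal_coefficients():
    """Minimize 5p^2 + 10q^2 + 10r^2 subject to the two equations (60)."""
    counts = [Fraction(5), Fraction(10), Fraction(10)]      # u's sharing p, q, r
    rows = [
        [Fraction(5), Fraction(10), Fraction(10)],          # c_0 = 1
        [Fraction(5), Fraction(135), Fraction(510)],        # c_2 = 0
    ]
    targets = [Fraction(1), Fraction(0)]
    # Stationarity of the Lagrangian: x_k = sum_i lambda_i * rows[i][k] / counts[k].
    basis = [[rows[i][k] / counts[k] for k in range(3)] for i in range(2)]
    # Substituting into the constraints gives a 2 x 2 system for the multipliers.
    m = [
        [sum(rows[i][k] * basis[j][k] for k in range(3)) for j in range(2)]
        for i in range(2)
    ]
    determinant = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    multipliers = [
        (targets[0] * m[1][1] - m[0][1] * targets[1]) / determinant,
        (m[0][0] * targets[1] - m[1][0] * targets[0]) / determinant,
    ]
    return tuple(
        multipliers[0] * basis[0][k] + multipliers[1] * basis[1][k]
        for k in range(3)
    )


if __name__ == "__main__":
    p, q, r = best_pivotal_coefficients()
    print(7 * p, 7 * q, 7 * r)                 # numerators of (61) over 7
    print(float(5 * p**2 + 10 * q**2 + 10 * r**2))
```

On the same basis the factor $5p^2 + 10q^2$ for King's formula (50) is .23392, so that the five-group formula gives the smaller mean square error.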

Since the pivotal values, as just emphasized, should fit the data in ac¬ 
cordance with some adequate criterion (which in (61) is taken to be that 
of least squares) and yet must permit the eventual construction of a 
smooth curve, a necessary compromise between tlie incompatibilities of 
maximum fit and smoothness* is inherent in the selection of the pivotal 
formula for the first step and the interix)Iation formula for the second. 
This compromise can of course be reached by using a graphic graduation 
to find the pivotal values (as illustrated by W. A. Jenkins in 'f.A.S.A., 
XXVlll, 206). Another profjosal, made by A. R. Reid ami j. H. Dow (in 
“Graduation by the General Formulae of Osculatory Interjwlation,” 
T.F.A., XIV, 185), was to use the flexible b term in (48) for the pivotal 
values so that the expected and actual deaths should be equal, and in the 
interpolations to yield maximum smoothness therefrom.f 

124. (i) All the pivotal formulae of pars. 120 -23, in accordance with 
the usual procedures, provide expressions for the calculation of central 
values. In any Sf)ccial case, however, a non-central formula can of course 
be deduced easily. For example, in constructing the 1911 -15 and 1916-20 

* It may be well to reproduce here the following remarks from “The Fundamental
Principles of Mathematical Statistics,” p. 120: “When the graduation is accomplished
by fitting a mathematical formula it will be clear that the main criterion must be a
test of goodness of fit, since the values derived from the graduation process will lie on a
mathematical curve, and therefore will be inherently ‘smooth.’ If, however, some other
method of graduation is employed which does not necessarily place the graduated
values upon an inherently smooth curve, it will evidently be necessary to test the
results for smoothness as well as for goodness of fit. In this connection it should be noted
also that in such a graduation of irregular data it will obviously not be practicable to
secure a best possible fit and greatest possible smoothness at the same time for the
ultimate interpretation of a ‘best possible fit’ would require the precise reproduction of
the original data, without any smoothness having been attained. It is therefore
necessary in such cases to settle the criteria for fit and smoothness so that the practical
results may be satisfactory, rather than best, in both respects.”

† Reid and Dow also illustrated the following alternative uses for the b term in
(48): (a) b taken numerically to make the formula resemble Everett’s formula closely;
(b) the pivotal values found as in (a), but in the interpolations b taken to secure
maximum smoothness; and (c) a double application of method (b). The maximum
smoothness was prescribed by making 2 a minimum.




New Zealand tables (see the 1921 census “Report regarding the
Construction of Life Tables,” and Polden’s paper noted in par. 61(iii) here)
a non-central formula (which emerges at once by putting n = 1 and
x = −2 in De Forest’s general expression (7) of T.A.S.A., XXVI, 87,
covering King’s assumptions of par. 120 here) was used to determine the
pivotal values at ages 10, 15, 20, ..., from data in a “5-9” grouping.

(ii) At the two ends of the table, the pivotal values which are not
reached by a symmetrical formula can be found (a) by the corresponding
unsymmetrical formulae which can be derived easily by the methods
already described, or (b) by finite difference extrapolations from the values
already determined by the symmetrical formula. An instance of (a) is the
$u_2 = .2w_0 - .008\Delta^2 w_0$, corresponding to King’s (49) herein for the central
term of the middle group, for the central term of the first of the three
quinquennial groups in the series $u_0 \ldots u_{14}$, which was used in the U.S.
Abridged Life Tables, 1919-20, p. 34. Examples of (b) may be found in
J.I.A., XLIII, 119 and 134, and in the U.S. Life Tables and Actuarial
Tables, 1939-41, pp. 124-25. In some cases, however, it may be preferable
to complete the ends of the table by the methods discussed in pars. 127-
128 here, without thus extending the series of pivotal values.
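
The unsymmetrical formula quoted in (ii)(a) can be checked numerically. The following is only a sketch: it assumes that the formula reads $u_2 = .2w_0 - .008\Delta^2 w_0$ and that King's central formula has the parallel form $u_7 = .2w_5 - .008\Delta^2 w_0$, where $w_0$, $w_5$, $w_{10}$ are three consecutive quinquennial totals. The function names and the second-degree test series are hypothetical, chosen so that both formulae should reproduce the central ordinates exactly.

```python
def king_central(w_prev, w_mid, w_next):
    """Central ordinate of the middle group: .2*w_mid - .008*(second difference)."""
    second_diff = w_next - 2 * w_mid + w_prev
    return 0.2 * w_mid - 0.008 * second_diff


def king_first_group(w0, w5, w10):
    """Central ordinate of the first group, using the same second difference."""
    second_diff = w10 - 2 * w5 + w0
    return 0.2 * w0 - 0.008 * second_diff


def u(x):
    """Hypothetical second-degree series, for illustration only."""
    return 1000.0 - 12.0 * x + 0.3 * x * x


# Quinquennial totals w_0, w_5, w_10 built from the yearly values u_0 ... u_14.
w0, w5, w10 = (sum(u(x) for x in range(start, start + 5)) for start in (0, 5, 10))

u7 = king_central(w0, w5, w10)      # should agree with u(7)
u2 = king_first_group(w0, w5, w10)  # should agree with u(2)
```

Because the second difference of the group totals is constant for a second-degree series, the same correction term serves for the first group as for the middle one; with irregular data the two formulae are approximations only.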

(iii) In King’s own applications of his pivotal value method, and in
nearly all other instances of its use, as noted in par. 120, the pivotal
values are determined from the populations, P, and deaths, D, separately;
then the corresponding pivotal values of $q_x$ follow immediately therefrom
(by the appropriate formula expressing q in terms of D and P); and finally
the interpolations are made from the pivotal values of $q_x$ so found. As a
variation of this procedure, and because the data were available at each
age, Sir Alfred Watson in preparing English Life Table No. 9 made an
experimental construction by determining quinquennial pivotal values
of $q_x$ directly (i.e., without operating on the deaths and populations
separately). It was found in that instance that the final table differed
inappreciably from the results of King’s method of using the separate
series of deaths and populations (see also T.A.S.A., XXIX, 334; and
J.I.A., LIX, 128 and 211; and T.F.A., XII, 129 and 149).

(xiii) Curve-fitting Methods

125. Instead of using the graphic or interpolation methods of pars.
96-124, appropriate curves can sometimes be fitted to the populations
and deaths, or to functions thereof. Such curve-fitting methods necessarily
produce smooth graduations (and without the undulations which may
appear in the results of some of the interpolation processes), although for
that reason they also generally show greater deviations from the
unadjusted data. The uses to which the final table will be applied must
therefore be considered in deciding the extent to which closeness of fit
may be sacrificed for the inherent smoothness which results from the
fitting of a mathematical formula (cf. J.I.A., XLVIII, 211, and XLIX, 98
and 542, and “The Fundamental Principles of Mathematical Statistics,”
p. 120).

(a) In a few instances where the data were available for each age,
Makeham’s first formula (by which $\mu_x$, colog $p_x$, or $m_x$ is represented as
$\alpha + \beta c^x$, or $l_x$ is taken correspondingly as $k s^x g^{c^x}$) has been used (see, for
examples, H. L. Rietz and C. H. Forsyth, R.A.I.A., I, 9, and H. L.
Trachtenberg, J.R.S.S., LXXXIII, 656). Where the data are available or
reliable in age groups only, the formula could be determined from the
central ordinates of each group, which would be found by deducting from
the group total 1/24th of the 2nd central difference (see Sir George F.
Hardy’s Institute of Actuaries’ Lectures on “The Theory of the
Construction of Tables of Mortality,” pp. 57 and 87, and Henderson’s
“Mortality Laws and Statistics,” p. 76).

Although for many tables based on population statistics the principle
of “uniform seniority,” which follows from Makeham’s formula and
greatly facilitates the computation of joint life annuities, has not been
considered to be important, it is nevertheless true in modern practice
that Makehamized graduations of widely used population tables may be
valuable. In the U.S. Life Tables and Actuarial Tables, 1939-41,
therefore, T. N. E. Greville prepared a very satisfactory Makeham version of
the table for total whites, with the uniform seniority values at ages 17 or
over and adjustment factors at younger ages for joint life annuities, and
also with adjusted ages for the calculation of approximate joint life
annuities on lives of different sex.* A detailed account is given there of
the valuable method, originally due to G. F. Hardy, by which the
Makeham constants were determined so that the annuity values of the
previously constructed non-Makehamized table should be reproduced as
closely as possible (see also the descriptions in the references stated in
“The Fundamental Principles of Mathematical Statistics,” p. 122).
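
The property of “uniform seniority” mentioned above can be illustrated numerically. The sketch below assumes Makeham's first formula in the form $l_x = k s^x g^{c^x}$, from which the $t$-year survival probability is $s^t g^{c^x(c^t-1)}$; two lives aged x and y may then be replaced by two lives of the equal age w defined by $2c^w = c^x + c^y$. The constants and function names are hypothetical and serve only for illustration.

```python
import math


def makeham_tpx(x, t, s, g, c):
    """t-year survival probability under l_x = k * s**x * g**(c**x)."""
    return s ** t * g ** (c ** x * (c ** t - 1))


def equal_age(x, y, c):
    """Age w satisfying 2*c**w = c**x + c**y (uniform seniority)."""
    return math.log((c ** x + c ** y) / 2) / math.log(c)


# Hypothetical Makeham constants, not taken from any published table.
s, g, c = 0.995, 0.9995, 1.09
x, y, t = 40, 55, 10

w = equal_age(x, y, c)
joint_xy = makeham_tpx(x, t, s, g, c) * makeham_tpx(y, t, s, g, c)
joint_ww = makeham_tpx(w, t, s, g, c) ** 2  # two lives of the equal age w
```

The two joint survival probabilities agree for every duration t, so joint life annuities need be tabulated for equal ages only. The equal age depends on the constant c alone, which is why separate graduations by sex (with different values of c) would lose the property, as the footnote explains.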

(b) For the reasons explained in the following par. 126, however, it is 

* Separate Makeham graduations for males and females would have required different
values of the constant c, with consequent loss of the “uniform seniority” method
for lives of different sex; it was accordingly felt that it was preferable to apply Makeham’s
formula to the table for total whites without separation by sex, and then to deal with
the calculation of joint life annuities on lives of different sex by using the Makehamized
table for both sexes together on the basis of specially adjusted ages (which generally
required an addition to the age for males and a deduction for females) from which the
equal ages of the uniform seniority table could be taken.




usually found that Makeham’s first formula must be modified by
introducing additional constants. Thus Hardy showed (see his Lectures, op.
cit., p. 88) that for the 1901 male population of England and Wales the
logarithms of the central ordinates (determined as stated in (a) here) of
decennial age groups, or log P′, could be represented in the form log P′ =
$A + Bx + Cx^2 + Dc^x$ according to Makeham’s second formula (by
which $\mu_x$, colog $p_x$, or $m_x$ is taken as $\alpha + \gamma x + \beta c^x$ and $l_x$ correspondingly
as $k s^x w^{x^2} g^{c^x}$). The same principle was employed in the preparation of the
tables for the British National Insurance Act, 1911, log ΣP′ (instead of
log P′) being represented by the form just given, and the deaths being
dealt with by fitting to Σd′ a basic curve of the same form together with a
similar supplementary curve determined so that log Σd′ (base) + log
Σd′ (supplementary) = log Σd′ (final table), as shown in J.I.A., XLVII,
553.

(c) Makeham’s first and second formulae are, of course, modifications
of Gompertz’s original geometrical progression $\mu_x = Bc^x$. From this
$\log_e \mu_x = a + bx$ where $a = \log_e B$ and $b = \log_e c$, so that Gompertz’s
formula can be written in the exponential form $\mu_x = e^{a+bx}$. Three
improved and more elastic expressions (i) $\mu_x = e^{a_0+a_1x+a_2x^2}$,
(ii) $\mu_x = e^{a_0+a_1x+a_2x^2+a_3x^3}$, and
(iii) $\mu_x = e^{a_0+a_1x+a_2x^2+a_3x^3+a_4x^4}$ have been suggested
by H. L. Trachtenberg (“The Wider Application of the Gompertz Law of
Mortality,” J.R.S.S., LXXXVII, 278; see also J.I.A., LXIII, 45), with
illustrative applications to the English Life Tables. By taking logarithms
of each side these expressions can be fitted easily by moments or least
squares (in which connection see also “The Fundamental Principles of
Mathematical Statistics,” p. 326, with respect to the weights to be
assigned in a least squares fitting of log $\mu_x$). The last form (iii), which gave
the best results, provides for the two points of inflexion, which are often
found in the curve of log $\mu_x$, at ages b and c as determined from the
relation

\[ \frac{d^2}{dx^2}\log \mu_x = (x - b)(x - c). \]
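
As an illustration of fitting “by taking logarithms of each side,” the sketch below fits the simple Gompertz form by an unweighted least-squares line through the points $(x, \log_e \mu_x)$, and finds the inflexion ages of a fourth-degree exponent as the roots of its second derivative $2a_2 + 6a_3x + 12a_4x^2$. It is not Trachtenberg's own procedure: the weights discussed in the text are ignored, $a_4$ is assumed to be non-zero, and all data and names are hypothetical.

```python
import math


def fit_gompertz(ages, mu):
    """Unweighted least-squares line through (x, ln mu_x); returns (B, c)."""
    n = len(ages)
    ys = [math.log(m) for m in mu]
    x_bar = sum(ages) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in ages)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, ys))
    b = sxy / sxx            # slope, log_e c
    a = y_bar - b * x_bar    # intercept, log_e B
    return math.exp(a), math.exp(b)


def inflexion_ages(a2, a3, a4):
    """Real roots of 2*a2 + 6*a3*x + 12*a4*x**2 = 0 (a4 assumed non-zero)."""
    disc = (6 * a3) ** 2 - 4 * (12 * a4) * (2 * a2)
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return tuple(sorted(((-6 * a3 - root) / (24 * a4),
                         (-6 * a3 + root) / (24 * a4))))


# Hypothetical data lying exactly on a Gompertz curve mu_x = B * c**x.
ages = list(range(40, 91, 5))
mu = [0.00005 * 1.10 ** x for x in ages]
B, c = fit_gompertz(ages, mu)

# Hypothetical quartic coefficients chosen to give inflexions at ages 30 and 70.
points = inflexion_ages(1.26e-3, -2.0e-5, 1.0e-7)
```

With real data the fitted line would not pass through every point, and the choice of weights would affect the constants obtained.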

(d) Because the curves of $q_x$ and allied functions such as $\mu_x$, $m_x$,
colog $p_x$, and log $\mu_x$, usually present difficulties when any attempt is made
to fit a single curve over the whole span of life, numerous suggestions have
been made by which further modifications of Makeham’s formula, or
other expressions of quite different character, might be fitted to the data
over large or small ranges. For the purposes of this Study it will be
sufficient to refer the reader to “The Fundamental Principles of
Mathematical Statistics,” pp. 79-85 and 319, where various formulae which
have been proposed by Hardy, Buchanan, Lidstone, Perks, Wittstein,




Steffensen, and others are stated, with indications of their applicabilities
and properties (which in some cases include modified forms of “uniform
seniority”).

(e) Type I of Pearson’s system of frequency curves (for which see
“The Fundamental Principles of Mathematical Statistics,” pp. 65-74 and
512-19) was applied extensively by T. G. Ackland as a means of
representing the age distributions of the populations of India at the census of
1911 (see XLVII, 515). The results on that occasion, however,
were considered to be unsatisfactory, H. G. W. Meikle in his 1921
Report (noted in par. 52 here) having concluded that the rates of
mortality deduced therefrom showed at some ages unjustifiable departures
from those which the data really indicated (see also J.I.A., XLVII, 407).
Another instance of the application of Pearson’s frequency curves to
population statistics occurred in the preparation of the first national life
tables for Egypt from the censuses of 1917 and 1927, where M. R.
El-Shanawany Effendi used the curves to effect drastic redistributions of the
data which were disturbed seriously by digit selections (see J.I.A.,
LXVIII, 188).

(f) The use of curves to represent the curtate or complete expectation
of life ($e_x$ or $\mathring{e}_x$) has also received some attention, because it may be
claimed that in the calculation of those functions a certain amount of
graduation has been done implicitly (see J.I.A., XLI, 95), and because
afterwards $q_x$ could be computed easily therefrom (even if some further
smoothing were required to reach completely satisfactory values of the
basic $q_x$ which is, of course, the primary function required). Being a
gradually decreasing curve, $\log_{10} e_x = a + bx + cx^2 + dx^3 + fx^4$ was
suggested by G. F. Hardy (in his Lectures, op. cit., p. 79); Dr. J. Brownlee
used $e_x = ma^x + nb^x$ with considerable success (see T.A.S.A., XXV, 154,
and the references therein); the formula

ga nK 

gtCX — _ 

where K is the complete expectation and e is the base of the Napierian
logarithms, has been applied to 25 of the U.S. Life Tables, 1910 (see
J.R.S.S., LXXXIV, 455); and J. F. Steffensen has experimented with the
reciprocal (to produce an increasing series) in the Makeham form

\[ \frac{1}{e_x} = A + Bc^x \]

in two papers in the Proceedings of the Fifth International Congress of
Actuaries, II, 247, and Svenska Aktuarieföreningens Tidskrift, 1917.
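
The remark that $q_x$ “could be computed easily therefrom” may be illustrated for the curtate expectation. The sketch below relies on the standard life-table identity $e_x = p_x(1 + e_{x+1})$, which is not stated in the text above, so that $q_x = 1 - e_x/(1 + e_{x+1})$. The short table of rates and the function names are hypothetical.

```python
def curtate_expectations(q):
    """e_x for a closed table of q_x (last q_x = 1), using e_x = p_x * (1 + e_{x+1})."""
    e_next = 0.0
    out = []
    for qx in reversed(q):
        e_next = (1.0 - qx) * (1.0 + e_next)
        out.append(e_next)
    return out[::-1]


def q_from_expectations(e):
    """Recover q_x = 1 - e_x / (1 + e_{x+1}); e beyond the table is taken as 0."""
    padded = list(e) + [0.0]
    return [1.0 - padded[i] / (1.0 + padded[i + 1]) for i in range(len(e))]


# Hypothetical rates for five consecutive ages, ending with q = 1.
q = [0.10, 0.15, 0.25, 0.40, 1.00]
e = curtate_expectations(q)
q_back = q_from_expectations(e)
```

When the expectations come from a fitted curve rather than from the rates themselves, the recovered values of $q_x$ may be irregular, which is the reason for the further smoothing mentioned in the text.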




(g) The advantages which flow from the inherent smoothness of
mathematical curves can sometimes be secured effectively by the useful
device of graduating data by reference to some appropriate standard
table from which irregularities have already been removed. Thus it has
been suggested (using primed symbols to denote the values of the
standard table) that



is a function of small values which progress slowly (see G. J. Lidstone, 
XXX. 212), or that 

T I 
y,/ and J; 

could be used in respect of populations and deaths at age x and beyond
(see Henderson’s “Mortality Laws and Statistics,” p. 60). For certain
sections of the 1921 census of India the population data were represented
satisfactorily by fitting 3rd degree parabolas, usually of the form

T 

$y = 1 - ax - bx^2 + cx^3$, to log

(see Meikle’s Report referred to in par. 52 here, and the similar principles
used in Vaidyanathan’s subsequent 1931 Report; and A. Henry, J.I.A.,
XLVII, 407). The selections of appropriate standard tables and graduat¬
ing formulae in the practical application of this method give wide scope, 
of course, for variations in judgment and ingenuity, and the inherent 
smoothness of the results, as in all these curve-fitting procedures, leaves 
only the goodness of fit to be tested satisfactorily. 

(C) OLDEST AGES; AND “JUVENILE” AGES, i.e., BETWEEN
INFANTILE AND ADULT AGES

126. On pp. 78 and 82 of “The Fundamental Principles of
Mathematical Statistics” diagrams are shown for the typical curves of $l_x$ and $q_x$
(and $m_x$, $\mu_x$, and colog $p_x$), with the object of clarifying the nature of the
problem involved in the graduation of those functions.* The typical form
of the curve of $q_x$ (and of $m_x$, $\mu_x$, and colog $p_x$) from infancy to old age is
(as there described, p. 81) “a contorted U-shaped curve, with the
minimum in the neighbourhood of age 11, so that from about age 11 to the
end of life the curve increases steadily (often with two minor undulations
in the thirties and seventies) with its convexity towards the x-axis (cf.
W. Perks, J.I.A., LXIII, 41, 45, and 55, and J. S. Elston, R.A.I.A., XII,

* These diagrams have been reproduce*! by M. D. Miller on p. C0<i of the mono¬ 
graph “Elements of Graduation” previously mentioned. 




88). If, then, the values up to about age 10 are dealt with separately, the 
remainder of the curve from the region of age 11 upwards will usually be 
found to change slowly at first, and at the older ages to resemble more 
nearly a geometrical progression, a circumstance which means that often
the logarithms of the values there approximate to an arithmetical
progression. The points of inflexion, however, introduce great difficulties into
the problem of finding any expression which will represent mortality 
rates over the whole period of life.” Furthermore, at the oldest ages (be¬ 
yond about age 90 or 95) the data are often both inadequate and unre¬ 
liable (cf. par. 41 here). In consequence, the preceding methods of section 
(B) for the adult ages are usually satisfactory only until about age 90 or 
95 at the one end of the table, and age 11, or even 15, at the other. Beyond 
those ages special methods are generally required. 

127. At the oldest ages, on account of the paucity and unreliability of
the data, it is often found that different methods of estimating the rates
of mortality produce widely differing results, so that it is frequently
desirable to compare the indications given by several processes. There is
considerable justification, moreover, for terminating the tables at a
reasonable, even if arbitrary, age by some more or less artificial method. George
King, for instance (in J.I.A., XLII, 238 and 252, and XLIII, 119), first
used ordinary interpolation from a short range of preceding values with
$q_\omega = 1$, where the limiting age $\omega$ was fixed arbitrarily; subsequently,
however, in his constructions of the official English Life Tables Nos. 7
and 8, he employed suitable values as the basis of extrapolation without
assuming q = 1 at any age (see also J.I.A., XLII, 287 and 290, and XLIX,
315). In connection with these finite difference methods at the oldest ages
where the calculations are often based on irregularly spaced values, the
Newton-Sheppard system of adjusted differences, with its avoidance of
the solution of simultaneous equations (as suggested in J.I.A., LVIII,
310) was employed by F. W. Barford for the interpolations at ages 88 to
104 from r/s.'i, qm qm^ 792 . and = 1 in his construction of the Australian 
Life Tables $A^{M33}$ and $A^{F33}$ (see his paper on “Australian Population
Mortality, Census of 1933,” Actuarial Society of Australasia, 41st Session,
1936). In the next English Life Table No. 9, on the other hand, the
indicated values of $q_x$ began to decrease above age 100, so that King’s methods
were found to be unsatisfactory; but it was observed that the ratio

log ( lO^w ) 

log (lO^Hl) 

was approximately equal to 

log (10^94) 

log (lO^w) 




so that a Gompertz graduation (cf. par. 97) was adopted from age 85.
Gompertz graduations were also found to be appropriate from age 87 in
the English Life Table No. 10 and the concurrent Scottish Life Tables.
The Life Tables (1926) for Northern Ireland, however, showed that such
a method would be unsuitable, and a Makeham graduation based on $p_x$
at ages 70, 80, and 90 was eventually employed. In the United States Life
Tables, 1890, etc., Wittstein’s formula (par. 125(d) here)

fH 

was used. An interesting comparison of several methods, namely, (1) 
constant second differences for log qx from ages 77, 82, and 105, (2) Witt- 
stein’s formula based on those same ages, (3) Wittstein’s formula using 
ages 67, 82, and 105, and (4) constant third differences using ages 72, 77, 
82, and 105, has also been given by Henderson (T.A.S.A., XXXV, 279) 
for determining the values of qx between ages 87 and 102 on the basis of 
M. D. Grant’s Canadian data.

128. At the other end of the table, from the infantile ages 0-4 to about
ages 10-15, the curve of $q_x$ changes so rapidly that special methods are
frequently necessary in order to represent it properly and at the same
time to secure a smooth junction with the infantile and adult values. In
Farr’s method the matter was dealt with by a special adjustment to $m_{10}$,
as mentioned in par. 97. In King’s paper in J.I.A., XLIII, 124 and 135,
the values were supplied by calculating a third difference from known
values, and in English Tables Nos. 7 and 8 he used Lagrange’s formula
(see footnote to par. 52) to fourth differences, although the results were
not altogether satisfactory. In English Life Table No. 9 Sir Alfred Watson
preferred a third difference interpolation from four non-equidistant
values, and the same method was adopted in the Northern Ireland Life
Tables (1926), the arithmetical procedures and the Newton-Sheppard
system of adjusted differences being discussed further in J.I.A., LVIII,
60 and 310, and being used and illustrated also in Barford’s Australian
paper mentioned in par. 127 here.
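
A third difference interpolation from four non-equidistant values amounts to passing a third-degree curve through the four given points. The sketch below does this with Lagrange's formula rather than with the Newton-Sheppard adjusted differences cited above; the two should give the same third-degree curve, but the arithmetic here is not that of the published tables. The ages, the test curve, and the function names are hypothetical.

```python
def lagrange(points, x):
    """Value at x of the polynomial through the given (age, value) points."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        weight = 1.0
        for j, (xj, _) in enumerate(points):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * yi
    return total


def cubic(x):
    """Hypothetical third-degree curve standing in for q_x at juvenile ages."""
    return 0.02 - 0.003 * x + 0.00018 * x ** 2 - 0.000003 * x ** 3


ages = (4, 7, 12, 17)  # hypothetical, unequally spaced
points = [(a, cubic(a)) for a in ages]
interpolated = {x: lagrange(points, x) for x in range(5, 12)}
```

Because the test values lie on a third-degree curve, the interpolated values reproduce it exactly; with actual rates the curve passes through the four given values only.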

A significant departure from such methods, however, was found to be
necessary in Sir Alfred Watson’s reports on the English Life Table No. 10
and the comparable tables for Scotland based on the census of 1931 and
the deaths of 1930-32, because the abnormal variations which occurred in
the birth rate during and after the war of 1914-18 had caused such
irregularities in the census enumerations at ages 11 to 14, and in the deaths at
each age during the three years, that interpolation based on the pivotal




value of $q_{12}$ (amongst others) was felt to be unreliable. In order to secure
a close relation between the deaths and the populations from which they
arose, the following procedure was therefore employed (see the references
in par. 120). The deaths at age 12, say, in 1930-32 would have arisen from
the births


(i r+-| i8i'+ 8 1 + (I 1 iSf +1 i /Sf) 


= - 112 say , 

where (in the notation of par. 91 here) jS” represents the births in the nth
quarter of 1917, and denotes the births during the whole of 1917. Also,
since the census date was taken as at the end of the 4th month of 1931,
the populations enumerated at ages 11, 12, and 13 would have arisen
from the births




A corrected value for $m_{12}$ was then taken as

Deaths aged 12 in_^030 -32 ^ /iia 

Census Populations aged 11, 12, and 13 /I 12 ’ 


and similarly for each age from 6 to 16; and finally these values were 
suitably graduated. 

In view of the admittedly inferior results given by King’s use of
Lagrange’s formula in English Life Tables Nos. 7 and 8, T. G. Ackland
suggested in J.I.A., XLIX, 336 (see also p. 344) that better results could
be obtained by deducing $q_x$ for each age from 7 to 10 by osculatory
interpolation (on the basis of equal intervals) from $q_1$, the unadjusted $q_6$, and
the pivotal values $q_{11}$ and $q_{16}$, while those for ages 11 to 16 could be
derived similarly from the unadjusted $q_6$ and the pivotal $q_{11}$, $q_{16}$, and $q_{21}$.
Other osculatory methods may also be employed. Thus Ackland in J.I.A.,
XLIX, 372, illustrated the process of osculatory interpolation for unequal
intervals which may be useful in some cases; J. Buchanan, in J.I.A.,
XLII, 385, and Trans. 6th Int. Cong. Actuaries, II, 615, suggested that a
fourth degree curve could be used through three ages five years apart
(such as 5, 10, and 15) “so as to have at its ends the same slope as the
curves to which it is there joined”; and R. Henderson in “Mortality Laws
and Statistics,” p. 59, described a process which involves the
determination from known ordinates of the first and second differential coefficients
at ages such as 5 and 10, these differential coefficients then being used in
Sprague’s osculatory formula which in terms of differential coefficients






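($u'$ and $u''$) may be written

\[
\begin{aligned}
u_{x+h} ={}& u_x\,\frac{(t-h)^3(t^2+3th+6h^2)}{t^5}
 + u'_x\,\frac{h(t-h)^3(t+3h)}{t^4}
 + u''_x\,\frac{h^2(t-h)^3}{2t^3} \\
&+ u_{x+t}\,\frac{h^3(10t^2-15th+6h^2)}{t^5}
 - u'_{x+t}\,\frac{h^3(t-h)(4t-3h)}{t^4}
 + u''_{x+t}\,\frac{h^3(t-h)^2}{2t^3}
\end{aligned}
\]

for interval x to x + t.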
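
Sprague's osculatory formula in terms of differential coefficients can be tried numerically. The sketch below assumes the formula takes the fifth-degree form in which $u_x$, $u'_x$, $u''_x$ carry the weights $(t-h)^3(t^2+3th+6h^2)/t^5$, $h(t-h)^3(t+3h)/t^4$, $h^2(t-h)^3/2t^3$, and $u_{x+t}$, $u'_{x+t}$, $u''_{x+t}$ carry $h^3(10t^2-15th+6h^2)/t^5$, $-h^3(t-h)(4t-3h)/t^4$, $h^3(t-h)^2/2t^3$. The function name and the fifth-degree test curve are hypothetical.

```python
def sprague_osculatory(h, t, u0, du0, d2u0, u1, du1, d2u1):
    """Value at x + h from ordinates and first two derivatives at x and x + t."""
    d = t - h
    return (
        u0 * d ** 3 * (t * t + 3 * t * h + 6 * h * h) / t ** 5
        + du0 * h * d ** 3 * (t + 3 * h) / t ** 4
        + d2u0 * h * h * d ** 3 / (2 * t ** 3)
        + u1 * h ** 3 * (10 * t * t - 15 * t * h + 6 * h * h) / t ** 5
        - du1 * h ** 3 * d * (4 * t - 3 * h) / t ** 4
        + d2u1 * h ** 3 * d * d / (2 * t ** 3)
    )


# Hypothetical fifth-degree curve with its first and second derivatives.
def f(x):
    return 1e-5 * x ** 5 - 2e-3 * x ** 3 + 0.1 * x + 1.0


def f1(x):
    return 5e-5 * x ** 4 - 6e-3 * x ** 2 + 0.1


def f2(x):
    return 2e-4 * x ** 3 - 1.2e-2 * x


x, t = 5, 5  # interval from age 5 to age 10
value_at_7 = sprague_osculatory(2, t, f(x), f1(x), f2(x),
                                f(x + t), f1(x + t), f2(x + t))
```

A curve of the fifth degree is reproduced exactly, since six conditions (an ordinate and two derivatives at each end) determine it; at h = 0 and h = t the formula returns the end ordinates themselves.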

A further illustration of the necessity for special methods at the
juvenile ages is the procedure devised by T. N. E. Greville in the U.S. Life
Tables and Actuarial Tables, 1939-41 (pp. 126 and 136), where “the rates
for ages 5 to 11 were interpolated from a special third degree curve
determined so as to reproduce the calculated rates of mortality at ages 4, 7,
and 12, and to have the same first derivative at age 12 as the Karup-King
curve used for interpolation in the age interval 12 to 17.” Writing the
Karup-King formula (par. 101 here) for the age interval 7 to 12 as

\[ q_{7+t} = \frac{t'}{5}\,q_7 + \frac{t'^2(t'-5)}{250}\,\delta^2 q_7
 + \frac{t}{5}\,q_{12} + \frac{t^2(t-5)}{250}\,\delta^2 q_{12}, \]

where $t' = 5 - t$, Greville remarked that, since reproduction of $q_7$ and $q_{12}$
and equality of the derivatives at age 12 would be satisfied with any value
of $\delta^2 q_7$, an artificial value c for $\delta^2 q_7$ can be used so that the value of $q_4$
will be reproduced. The preceding formula with t = −3 thus gives $q_4 =
1.6q_7 + .768c - .6q_{12} - .288\,\delta^2 q_{12}$, whence, by substituting $\delta^2 q_{12} = q_{17} -
2q_{12} + q_7$ and solving,

\[ c = \tfrac{1}{96}\,(125q_4 - 164q_7 + 3q_{12} + 36q_{17}), \]

which was used as the artificial value of $\delta^2 q_7$ in the Karup-King formula.*

* An analogous device, for which the determination of the artificial value c there
required is also shown by Greville (loc. cit.), was used for ages 28 to 31, where a special
third degree curve was found which would have the same ordinate and first derivative 
at age 27 as the Karup-King curve used in the age interval 22 to 27, and the same ordi¬ 
nate and first derivative as Jenkins’ non-reproducing formula (47) of this Study which 
was used in the interval 32 to 37. 
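
Greville's artificial value c for the juvenile ages can be verified by direct calculation. The sketch below assumes the Karup-King formula for the interval 7 to 12 in the form $q_{7+t} = \frac{t'}{5}q_7 + \frac{t'^2(t'-5)}{250}\delta^2 q_7 + \frac{t}{5}q_{12} + \frac{t^2(t-5)}{250}\delta^2 q_{12}$ with $t' = 5 - t$, which yields the coefficients 1.6, .768, −.6, and −.288 quoted for t = −3. The rates and function names are hypothetical.

```python
def karup_king(t, q7, q12, d2_q7, d2_q12):
    """Karup-King value at age 7 + t from q_7, q_12 and their second central differences."""
    tp = 5 - t
    return (tp / 5 * q7 + tp * tp * (tp - 5) / 250 * d2_q7
            + t / 5 * q12 + t * t * (t - 5) / 250 * d2_q12)


def greville_c(q4, q7, q12, q17):
    """Artificial second difference at age 7 that makes the curve reproduce q_4."""
    return (125 * q4 - 164 * q7 + 3 * q12 + 36 * q17) / 96


# Hypothetical rates of mortality at ages 4, 7, 12, and 17.
q4, q7, q12, q17 = 0.0012, 0.0008, 0.0007, 0.0015

c = greville_c(q4, q7, q12, q17)
d2_q12 = q17 - 2 * q12 + q7
q4_reproduced = karup_king(-3, q7, q12, c, d2_q12)
rates_5_to_11 = [karup_king(age - 7, q7, q12, c, d2_q12) for age in range(5, 12)]
```

The curve still passes through $q_7$ and $q_{12}$ whatever value is given to c, because the weight attached to the second difference at age 7 vanishes at both ends of the interval.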




VIII


THE CONSTRUCTION OF ABRIDGED LIFE
TABLES FROM POPULATION STATISTICS


129. When it is anticipated that a mortality experience will form the
basis of extensive financial calculations it is necessary to construct a
complete life table, with the consequent monetary functions, in addition to
the preparation of the fundamental graduated rates of mortality, $q_x$, at
each age. If, however, it is intended to use the data merely as a guide to
the mortality it is not necessary to proceed beyond the determination of
reliable values of $q_x$; and in comparing such rates with those of other
communities or classes a life table need not be constructed, for comparisons
may be effected immediately between the rates of mortality themselves.
Even for such simple purposes, however, Medical Health Officers have
frequently considered it essential to calculate certain life-table functions,
particularly $\mathring{e}_x$; and although this is not necessary, and in some cases may
even be misleading (see Section IX here), some of the short methods of
procedure are valuable contributions to the general theory of life-table
construction, and will therefore be considered here.

The general principle of these abridged methods is that, when the data
are taken in age groups (x to x + n), the n-year probabilities ${}_np_x$ can be
computed approximately and hence the values of $l_x$ at the corresponding
age intervals follow immediately from an arbitrary radix. Then, in order
to pass to $e_x$, which is

\[ \frac{\sum_{t=1}^{\infty} l_{x+t}}{l_x}, \]

or to $\mathring{e}_x$, which is

\[ \frac{\int_0^{\infty} l_{x+t}\,dt}{l_x}, \]

it is only necessary to pass from these values of ${}_np_x$ or $l_x$ at nthly intervals
to the sums of ${}_tp_x$ or of $l_{x+t}$ or of $L_{x+t}$ within each interval, and thence by
summation to the values of $e_x$ or $\mathring{e}_x$ for nthly values of x.

130. (i) Dr. Farr's Method

In this method, which was introduced in connection with certain local
life tables based on the 1841 census of England and Wales (Supplement
35th Registrar-General’s Report), the rate at age 0 was found from the births and
deaths, and m at each age from 1 to 4 from the death and census returns.
For subsequent age-groups covering n years (n = 5 or 10) the ratio of
deaths to population as in par. 97, or ${}_nm_x$ say, was used to give ${}_np_x$ by


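means of the assumption that

\[ {}_np_x = \left(\frac{2 - {}_nm_x}{2 + {}_nm_x}\right)^{n}. \]

From these values of ${}_np_x$ those of $l_x$ were then determined for ages n years
apart, and thence $\sum L_x$ and $\mathring{e}_x = \sum L_x / l_x$ at such ages were found on the
assumption that the years of life experienced between ages x and x + n
were $\frac{n}{2}(l_x + l_{x+n})$.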

This method of deriving ${}_np_x$ would be satisfactory if $p_x$ followed a
geometrical progression; but since this is not generally the case the
resulting values are not very reliable. The assumption as to the years of life,
also, usually overstates the true values, with the result that the final
values of $\mathring{e}_x$ are considerably overstated, especially at the higher ages
(see Newsholme’s “Vital Statistics,” 3rd Edn., pp. 282-84; and J.I.A.,
XLIII, 78).
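
The steps of Farr's method may be sketched as follows. This is an illustration only: it assumes that the n-year probability is obtained as ${}_np_x = \left(\frac{2 - {}_nm_x}{2 + {}_nm_x}\right)^n$ (consistent with the value of p given for Hayward's method in par. 131) and that the years of life in each group are taken as $\frac{n}{2}(l_x + l_{x+n})$. The age groups and rates are hypothetical, and the table is simply cut off after the last group supplied, so the expectations shown are temporary ones.

```python
def farr_abridged(groups, radix=100000.0):
    """groups: list of (x, n, m) = starting age, width, ratio of deaths to population."""
    rows = []
    lx = radix
    for x, n, m in groups:
        annual_p = (2 - m) / (2 + m)
        npx = annual_p ** n
        lx_next = lx * npx
        years = n / 2 * (lx + lx_next)  # years of life between ages x and x + n
        rows.append({"x": x, "lx": lx, "npx": npx, "L": years})
        lx = lx_next
    running = 0.0
    for row in reversed(rows):  # sum the years of life from the bottom upwards
        running += row["L"]
        row["ex"] = running / row["lx"]
    return rows


# Hypothetical age groups and death-rates, for illustration only.
groups = [(5, 5, 0.004), (10, 5, 0.002), (15, 10, 0.003)]
table = farr_abridged(groups)
```

Since $l_x$ is usually convex within a wide group at the higher ages, the straight-line assumption for the years of life tends to overstate them, which is the overstatement that Hayward's subdivision is designed to reduce.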

131. (ii) Dr. Hayward's Method 

In order to correct the overstatement which follows from taking the
years of life simply as $\frac{n}{2}(l_x + l_{x+n})$ in Farr's method, Dr. Hayward
divided the n years into k parts and calculated the years of life as

\[
\frac{n}{k}\left[\frac{l_x}{2} + l_xp^{n/k} + l_xp^{2n/k} + \cdots
 + l_xp^{(k-1)n/k} + \frac{l_{x+n}}{2}\right]
= \frac{n}{k}\left[\frac{l_x + l_{x+n}}{2}
 + l_xp^{n/k}\,\frac{1 - p^{(k-1)n/k}}{1 - p^{n/k}}\right],
\]

where

\[ p = \frac{2 - {}_nm_x}{2 + {}_nm_x}. \]

He found that the quinquennial groups 5-9 and 10-15 need not be
subdivided, that the decennial groups 15-24 up to 65-74 should be divided
into two, that the groups 75-84 and 85-94 should each be taken in four
equal parts, and that for the final group at ages 95 and upwards the
calculation should be made in yearly stages (k = n), the value of p in
this last group being found by extrapolation from the values of p for the
four preceding groups (see the references to Newsholme's Vital Statistics
and J.I.A., XLIII, in par. 130 here).

This method has been used by Dr. Hayward in the construction of a
series of tables of $l_x$ for England and Wales for each decennium from
1841-1900 (J.R.S.S., LXIV, 636, and LXVI, 366), and also by Dr. Dunlop in
Scotland. The values so obtained are reasonably close to those of the
extended methods, as may be seen from the comparisons in J.I.A., XLII,
280-83, which show a fairly uniform understatement (not exceeding .12)
in $\mathring{e}_x$ for English Life Table No. 6.
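
Hayward's subdivision can be compared with Farr's single trapezium by a short calculation. The sketch below assumes that the years of life are $\frac{n}{k}\left[\frac{l_x}{2} + l_xp^{n/k} + \cdots + l_xp^{(k-1)n/k} + \frac{l_{x+n}}{2}\right]$ with $l_{x+n} = l_xp^n$, and sums the geometric series to give a closed form; the closed form written here is one reading of the formula and may differ in arrangement from Hayward's own. The numerical values and function names are hypothetical.

```python
def hayward_years(lx, p, n, k):
    """Years of life in an n-year group divided into k equal parts (term-by-term sum)."""
    step = n / k
    inner = sum(lx * p ** (j * step) for j in range(1, k))
    return step * (lx / 2 + inner + lx * p ** n / 2)


def hayward_closed(lx, p, n, k):
    """The same quantity with the geometric series summed."""
    r = p ** (n / k)
    lx_n = lx * p ** n
    return (n / k) * ((lx + lx_n) / 2 + lx * r * (1 - r ** (k - 1)) / (1 - r))


# Hypothetical values: 1,000 lives, annual survival .97, a decennial group.
lx, p, n = 1000.0, 0.97, 10
farr_years = n / 2 * (lx + lx * p ** n)              # Farr's single trapezium
by_parts = {k: hayward_years(lx, p, n, k) for k in (1, 2, 5, 10)}
```

With k = 1 the result is Farr's value; each further subdivision reduces the years of life, because $l_xp^t$ is convex in t and the straight-line assumption therefore overstates the area beneath it.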

132. (iii) George King's Method

This process, which was described by King in the Supplement to the
75th Registrar-General’s Report, Part I, p. 26, and J.I.A., XLVIII, 294,
is an abbreviation of his extended methods, as follows: (1) “Pivotal”
values of the populations and deaths for quinquennial ages are first found
by formula (49). (2) The quinquennial values of $m_x$, and thence $p_x$ and
log $p_x$, are then computed, the log $p_x$ at the end of the table being supplied
by constant third or fourth differences from the preceding values. (3) The
quinquennial log $p_x$ is then differenced three times, and log ${}_5p_x$ (= log $p_x$ +
log $p_{x+1}$ + ... + log $p_{x+4}$) is calculated therefrom by the ordinary finite
difference formulae

\[ \log {}_5p_a = 5\log p_a + 2\Delta\log p_a - .4\Delta^2\log p_a + .2\Delta^3\log p_a \tag{62} \]

for the first value; and

\[ \log {}_5p_x = 5\log p_{x-5} + 7\Delta\log p_{x-5} + 1.6\Delta^2\log p_{x-5} - .2\Delta^3\log p_{x-5} \tag{63} \]

for the second and subsequent quinquennial values. (4) Then, taking a
suitable radix for $l_x$, log $l_x$ and hence $l_x$ are formed quinquennially by the
relation log $l_{x+5}$ = log $l_x$ + log ${}_5p_x$. (5) In order to pass now to $e_x$, the
quinquennial $l_x$ is differenced three times, and the sum of the values for
each age in each interval, that is $\sum_{t=1}^{5} l_{x+t}$ for quinquennial values of x, are
computed by the ordinary finite difference formulae

\[ \sum_{t=1}^{5} l_{a+t} = 5l_a + 3\Delta l_a - .4\Delta^2 l_a + .2\Delta^3 l_a \tag{64} \]

for the first value, and

\[ \sum_{t=1}^{5} l_{x+t} = 5l_{x-5} + 8\Delta l_{x-5} + 2.6\Delta^2 l_{x-5} - .2\Delta^3 l_{x-5} \tag{65} \]

for subsequent quinquennial groups, the last values at the end of the
table being filled in empirically; and from these quinquennial sums of l we
immediately find $e_x$ by summing from the bottom upwards and dividing
by $l_x$.
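
The numerical coefficients in formulae (62) to (65) can be derived from the advancing difference formula $f(a + 5s) = f + s\Delta f + \frac{s(s-1)}{2}\Delta^2 f + \frac{s(s-1)(s-2)}{6}\Delta^3 f$ by summing over the five yearly steps concerned. The sketch below does this in exact fractions, assuming that the coefficients are as read above (5, 2, −.4, .2 and so on); the function names are hypothetical. It also converts the difference form of (62) into a form in the quinquennial values themselves.

```python
from fractions import Fraction


def sum_coeffs(first_step, count=5, width=5):
    """Coefficients of f, Δf, Δ²f, Δ³f in the sum of `count` consecutive yearly values,
    the first lying `first_step` years beyond the quinquennial point used as origin."""
    coeffs = [Fraction(0)] * 4
    for j in range(first_step, first_step + count):
        s = Fraction(j, width)
        terms = [Fraction(1), s, s * (s - 1) / 2, s * (s - 1) * (s - 2) / 6]
        coeffs = [c + term for c, term in zip(coeffs, terms)]
    return coeffs


def to_ordinates(coeffs):
    """Re-express c0*f + c1*Δf + c2*Δ²f + c3*Δ³f in terms of f0, f1, f2, f3."""
    c0, c1, c2, c3 = coeffs
    return [c0 - c1 + c2 - c3, c1 - 2 * c2 + 3 * c3, c2 - 3 * c3, c3]


formula_62 = sum_coeffs(0)  # yearly values at a, a+1, ..., a+4
formula_63 = sum_coeffs(5)  # yearly values at x, ..., x+4, origin at x-5
formula_64 = sum_coeffs(1)  # yearly values at a+1, ..., a+5
formula_65 = sum_coeffs(6)  # yearly values at x+1, ..., x+5, origin at x-5
extended_62 = [10 * v for v in to_ordinates(formula_62)]
```

Ten times the ordinate form of (62) has the integral coefficients 24, 34, −10, and 2 on the four successive quinquennial values, which avoids differencing altogether.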




This method is well adapted to machine calculations; and the routine
procedure is described minutely in the U.S. Abridged Life Tables, 1919-
20, and also by C. C. Grove in J.A.S.A., XVIII, 1028 (see also T.A.S.A.,
XXV, M-5). In the U.S. Tables the formulae were applied in the
corresponding extended forms which do not involve differencing, such as

\[ 10\log {}_5p_a = 24(\log p_a + \log p_{a+5}) + 10\log p_{a+5} - 10\log p_{a+10} + 2\log p_{a+15} \]

for (62). Grove also suggested that the determination of the pivotal
values may be omitted, although that is a matter which would have to
be decided from the nature of the data in each case (see J.I.A., XLIX,
.LS2); and he found the last few quinquennial values of $q_x$ (and thence log
$p_x$) at ages 97, 102, and 107 by adding to the values of $q_x$ at preceding ages
hypothetical first differences, $\Delta q$, determined by multiplying the first

differences of a standard table, $\Delta q'$, by the ratio

In King’s papers the abridged method was not applied at ages before
11 (or 12), for which a pivotal value is obtainable by the central formula
(49). It may therefore be noted that in the U.S. Tables Miss Foudray
completed the values to age 0 by determining r/n, i/i, and q» as in par.
^yiiiii), calculating a pivotal value at age 7 by the non-central formula of
par. 124(ii) here, and then finding log ${}_5p_2$ by formula (62) which was
modified on account of the skewness of the curve (as suggested by Henderson)
by taking the coefficient of A*’ log $p_2$ as unity in accordance with known
values of log ipn. Grove applied Henderson’s method of par. 92(ii) to
find $q_x$ for each age from 0 to 4; and since the calculation of the pivotal
values was omitted, $m_7$ was taken as the ratio of deaths to population for
age-group 5-9, from which $q_7$ follows, and then $q_5$ and $q_6$ were interpolated
by a hypothetical first difference based on a standard table as described
above for the oldest ages.

133. (iv) Method Given by Editors of J.I.A.

In J.I.A., XLVIII, 301, the Editors of the Journal of the Institute of
Actuaries suggested and illustrated the following rapid method of
proceeding directly from the $m$’s, which gave very close results: (1) Since in
the life table



let $m_x^{(5)}$ denote the central death-rate per 5 years for a 5-year age group,*

* This notation is retained here in order to exhibit the formulae as originally given
and to facilitate understanding of the numerical examples referred to in the J.I.A. It




that is, $m_x^{(5)} = \dfrac{5(l_x - l_{x+5})}{\int_0^5 l_{x+t}\,dt}$. Evaluating the denominator by formula (19) of
King’s Institute of Actuaries Text-Book, Part II, p. 476 (or taking
$a = 0$, $n = 1$, and $r = 5$ in the general form of the Euler-Maclaurin
expansion given on p. 189 of Freeman’s Mathematics for Actuarial
Students), we get

^(1 - up.) 


Heme 


and 


"■ {: (1 I- ..y>j - - ups-n .,■ 

^Px — . 

1+2 w/“^+ (\^Px I Swi'"’ 

( ' ) 

:.Ax ^ 11 + ,o fg/fs - Pt) I . 


If wc now assume that — px 4 \ (ef. par. *>7) the expression 

within the braiket becomes 


(?) fi) I («) («) 

‘.j « - tUx :: -= I - nix- & a|)|)roximateIy «/,, 


where $a_0$ is the usual first central difference of Woolhouse’s notation; so
that

login upx - — K - ywx’ ’ 


where $\kappa$ is the modulus. (2) These values may now be graduated, if
desired, or intermediate graduated quinquennial values may be obtained by
interpolation.

(3) Calculate

"^‘1 ■ lx 

by the continuous formula 

1'>K I 1 <'!r ’ j “= •<>}? I./'x + lOK [ 1 + J 51 . 




(4) Obtain 


V/x 


/x 

should be noted that $m_x^{(5)} = 5({}_5m_x)$. The basic relation in (1) of this paragraph is also

deduced in R.A.I.A., XXXII, 34-36, in terms of ${}_nm_x$.




by Woolhouse’s formula for approximate summation (see Freeman, op.
cit., 195, and King’s Text-Book, Part II, formula (27) on p. 478) which
gives

^(/*+. + /.+M+...) +2 + 2^^= «i‘42-2M., 


in which $\mu_x$ is taken as approximately




*+6 


10 /, ■ 10 


The numerical work by this method is very convenient, as may be seen 
from the examples in J.I.A., XLVIII, 302-3.

134. (v) Reed and Merrell’s Method

This method was published originally in the American Journal of
Hygiene, XXX (No. 2, September, 1939), 33, and was reprinted in the
U.S. Census Bureau’s Vital Statistics Special Reports, IX, No. 54. It
again utilizes a formula connecting ${}_np_x$ and ${}_nm_x$, and (as shown by T. N. K.
Greville’s examination of its mathematical basis in R.A.I.A., XXXII,
29-43) it is in fact closely related* to Method (iv) here.

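Denoting $l_x - l_{x+5}$ by ${}_5d_x$, and $T_x - T_{x+5}$ by ${}_5L_x$,

and since by definition ${}_5m_x$ is $\dfrac{{}_5d_x}{{}_5L_x}$,

and as also we have the relation

$${}_5d_x = -\frac{d}{dx}({}_5L_x)$$

as for $d_x$ and $L_x$, it follows that

$${}_5d_x = {}_5L_x \cdot {}_5m_x = -\frac{d}{dx}({}_5L_x),$$

and the Euler-Maclaurin summation formula then gives

$$l_x = {}_5L_x\left\{\frac{1}{5} + \frac{1}{2}\,{}_5m_x + \frac{5}{12}\left[({}_5m_x)^2 - \frac{d}{dx}({}_5m_x)\right] + \cdots\right\} \qquad (66)$$

since

$$\frac{d}{dx}({}_5L_x \cdot {}_5m_x) = {}_5L_x\left\{\frac{d}{dx}({}_5m_x) - ({}_5m_x)^2\right\}.$$


Now colog ${}_5p_x = \log l_x - \log l_{x+5}$; expanding the logarithm of the
expression in braces in (66), noting that $l_{x+5} = l_x - {}_5L_x \cdot {}_5m_x$, and simplifying,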

* For this reason the method is included in this Section as Method (v), although
actually it was published later than Methods (vi) and (vii).



we obtain (cf. also R.A.I.A., XXXII, 34-36) 

$$\text{colog}_e\,{}_5p_x = 5({}_5m_x) + \frac{125}{12}({}_5m_x)\frac{d}{dx}({}_5m_x) + \cdots \qquad (67)$$


If it is further assumed (cf. pars. 97 and 133) that

$${}_5m_x = Bc^{x+\frac{5}{2}}$$

approximately by Gompertz’s formula, it follows that $\frac{d}{dx}({}_5m_x) = k({}_5m_x)$

approximately where $k = \log_e c$. Substituting this value in (67) gives
finally the basic formula of Reed and Merrell’s method

$$\text{colog}_e\,{}_5p_x = 5({}_5m_x) + 125a({}_5m_x)^2 \quad\text{where } a = \frac{k}{12}. \qquad (68)$$


By fitting curves based on this explanation to 53 of Glover’s 1910 life
tables, they arrived at the value .008 for $a$, and on that basis published
tables showing the values of ${}_nq_x$ over a wide range of ${}_nm_x$ for $n = 5$ and 10.

With these values an abridged life table can be constructed “in less
than two hours,” and the results are usually sufficiently accurate for most
practical purposes (see R.A.I.A., XXXII, 38).
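As a rough illustration of how formula (68) is applied, the Python sketch below converts central death rates into a survival column. It is a minimal sketch under stated assumptions: the function names and the sample rates are hypothetical, the value $a = .008$ is the one quoted above, and the $n^3$ factor used for intervals other than 5 years is an assumption of this sketch (it reduces to the 125 of formula (68) when $n = 5$).

```python
import math

REED_MERRELL_A = 0.008  # value of a quoted in the text


def colog_np(m, n=5, a=REED_MERRELL_A):
    """colog_e of np_x from the central death rate m = nm_x.

    For n = 5 this is formula (68): 5m + 125 a m^2.
    The n**3 factor for other n is an assumption of this sketch.
    """
    return n * m + a * n ** 3 * m ** 2


def survival_column(rates, n=5, radix=100_000.0):
    """Build l_x at n-year intervals from a list of central death rates."""
    l = [radix]
    for m in rates:
        l.append(l[-1] * math.exp(-colog_np(m, n)))
    return l


# Hypothetical quinquennial central death rates (not taken from any table).
rates = [0.002, 0.004, 0.010]
column = survival_column(rates)

for age_step, lx in enumerate(column):
    print(5 * age_step, round(lx, 1))
```

The probability of dying in the interval follows as one minus `math.exp(-colog_np(m))`, which is the quantity the published tables were designed to supply without calculation.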

135. (vi) E. C. Snow’s Method

This method was published, with a great number of examples, in Part
II of the 75th Annual Report of the Registrar-General of England and
Wales (see also J.I.A., LII, 393, and J.A.S.A., XVII, 1025). The first
assumption is that in any sectional population ${}_np_x$ can be expressed as a
function of the observed death rate ${}_nm_x$ (the ratio of deaths to population
between ages $x$ and $x + n$), the constants in the assumed relation being
determined from those which were found to exist for the whole of England
and Wales in 1910-12 for similar values of ${}_nm_x$ (not necessarily for the
same age group). At ages under 10, $p_0$ was found to be given with sufficient
accuracy for the purposes in view as unity less the “infantile mortality


rate” c., 1 — as frequently given in Medical Officers’ reports (see
footnote, par. 79); $p_1$ was expressed as ^ + / » *'^*'*^


were calculated by the relation ${}_np_x = f - g\,{}_nm_x$. At ages over 10 the
quadratic form ${}_np_x = d + c({}_nm_x - e)^2$ was used with quinquennial or
decennial age groups; and at the final ages above 85 the type adopted was
$\log {}_np_x = a - b\,{}_nm_x$. Having found ${}_np_x$ from the observed ${}_nm_x$ by this
method, the column $l_x$ at the available age intervals then followed directly.
In order now to pass to $e_x$, the sums of $l$ in the corresponding intervals




were assumed to be expressible similarly in terms of ${}_np_x$, the relations
being of the form


for intervals 2-5* and 5-10, while at higher ages the same form, or a
constant relation, or the quadratic $d' + c'(e' \pm {}_np_x)^2$ was used according
to the range of ${}_np_x$. Finally $N_x$ was found by summing these interval
values, and $e_x$ followed at once.†

In the report previously mentioned some 280 abridged tables for
different parts of England and Wales were constructed by this method. The
calculations were greatly facilitated by the tabulation of the values of ${}_np_x$
and ${}_nk_x$ corresponding to ${}_nm_x$ and ${}_np_x$ respectively for usual ranges, thus
providing a simple procedure by which a very accurate abridged table
can be constructed “in three or four hours.” The method was also used to
compute quinquennial values for Northern Ireland in 1891, 1901, and
1911 (see the Registrar-General’s Review of Vital Statistics of Northern
Ireland and Life Tables 1926, p. 57). Those preliminary tabulations,
however, were necessarily preceded by a great amount of experimentation;
and as they may not be applicable to populations of other times or places
the method would hardly be applicable to other cases without
considerable investigation.

136. (vii) Construction by Reference to a Standard Table

(a) This process was given in J.I.A., LII, 396, and is in effect a
simplification of the J.I.A. method of par. 133. As in that method the relation
used to give colog $p$ from the $m$’s by age groups is

cologin npx = (l + Y0 • 


* In the description given here, $N_x$ as now prescribed by the new international
notation is adopted instead of the former open-faced letter used by Snow and the
actuaries of his day. For interval 1-2, necessarily $\frac{N_1 - N_2}{l_1} = 1$. Snow denoted
$\frac{N_x - N_{x+n}}{l_x}$ by ${}_nk_x$.

† In determining $\mathring{e}_x$ the usual relation, which at higher ages can be applied
continuously, was modified in the first year of age to allow for the uneven
distribution of deaths therein, in accordance with the formulae for $L_0$ in par. here.




Denoting functions of the table under construction by dashes, and those
of a standard table without dashes, it therefore follows that


colog colog 



In order now to pass from ${}_np_x$ so found to $e_x$ the relation employed was
(•1 +1/*^+ • . • + t - lpT -\- ilpf ) ~ 2 ^^ tPi) + • ilptlit \ t ~ fijr ) 


by Woolhouse’s formula for approximate summation (see Freeman, loc.
cit., and formula (26), p. 478, of King’s Text-Book), from which

(l+iPz + ‘ . •+ l-lpr+hpr) = (i -l-l/»x + . . .+(- Ipx+hpr) 

+ 2f(l + /^) - (1+,^)| 


approximately; and from the values calculated by this formula $e_x$ follows
by the continuous formula

= (i + + . . . + l-tfix + a/^x) + tl>J‘r \/. 


This method is very rapid and convenient, as may be seen from the
example given in J.I.A., LII, 396.

(b) A further simplification in the method of construction by reference
to a standard table has been introduced in the preparation of the abridged
life tables for whites and non-whites by sex which since 1945 have been
published each year for the United States by the National Office of Vital
Statistics (see the United States Abridged Life Tables, 1945, by T. N. K.
Greville, in Vital Statistics Special Reports, National Office of Vital
Statistics, XXIII, No. 11, for a discussion of the assumptions involved,
and Part I of the annual volume “Vital Statistics of the United States”).
If the death rates by ages in the standard table and in the table under
construction are known to be very close, it may be assumed with sufficient
accuracy that


The $l'$ values are then constructed on the basis of the figures so found,
and ${}_nL'_x$ is calculated from the approximate relation






The results are then summed to give $T_x$, and $\mathring{e}_x$ follows from division by $l_x$.
In these abridged U.S. tables the values of $q_0$ employed are found by
formula (18a) here (see also par. 79).



IX


METHODS OF COMPARING THE MORTALITIES 
OF DIFFERENT COMMUNITIES* 


137. As stated in par. 129 it is not necessary to proceed beyond the
determination of reliable values of $q_x$ or $m_x$ at each age in order to compare
the mortalities of different communities, for by that method the exact
nature and extent of the differences in mortality may be observed, not
only at the several age periods of life, but also as between the male and
female sexes. Such comparisons at each age, however, are somewhat
laborious; and a more compact method, which also has the advantage that
it partially eliminates accidental errors when the ungraduated rates are
employed, is therefore to compare the probability of dying over 5 (or 10)
year age-periods by using ${}_5q_x$ (or ${}_{10}q_x$). The use of such values, however,
necessitates either the construction of a life table or the calculation of
$\log {}_np_x$ by the short formula of pars. 133 and 136.

138. In order to employ the original data without the preliminary
construction of an extended or abridged life table, therefore, several more
compact measures, in the nature of index-numbers, have become widely
used. Of these, the first was the crude death rate, which is merely the ratio
of the deaths at all ages to the population at all ages. The crude rate for
males may thus be written symbolically as $\frac{\sum d^m}{\sum P^m}$, where the $\sum$ denotes the
sum (of the male deaths $d^m$ or of the male populations $P^m$) at each age (or
age group). This may be written also as

$$\frac{\sum(P^m q^m)}{\sum(P^m)} \qquad (69)$$

where $q^m$ is the rate of mortality $\frac{d^m}{P^m}$. Similarly the crude female rate is

$$\frac{\sum d^f}{\sum P^f} \quad\text{or}\quad \frac{\sum(P^f q^f)}{\sum(P^f)} \qquad (70)$$

and the crude “persons” rate is

$$\frac{\sum d^p}{\sum P^p} \quad\text{or}\quad \frac{\sum(P^p q^p)}{\sum(P^p)} \qquad (71)$$


* This section is largely founded on H. H. Wolfenden’s paper in J.R.S.S., LXXXVI,
399, “On the Methods of Comparing the Mortalities of Two or More Communities, and
the Standardization of Death-Rates.” The notation which was there used as being
most convenient for the purposes in view is retained here.


where of course $d^p = d^m + d^f$, $P^p = P^m + P^f$, and $q^p = \frac{d^p}{P^p}$. This formula
may also clearly be written as

$$\frac{\sum(d^m + d^f)}{\sum(P^m + P^f)} \quad\text{or}\quad \frac{\sum(P^m q^m + P^f q^f)}{\sum(P^m + P^f)}. \qquad (72)$$

These two forms (71) and (72) are identical, and, of course, give the same
numerical results.

These crude rates are the weighted averages of the death rates at the
various ages, where the weight is taken as the population $P$ of the
particular community. Such rates are clearly unsatisfactory and misleading;
for two communities with exactly the same rates of mortality at each age,
but with different relative values of $P$, may show radically different crude
death-rates, on account only of the different values of $P$, that is, by reason
solely of the fact that the two communities have different age and sex
constitutions.
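The point can be seen in a small numerical sketch. The Python below is illustrative only: the two communities, their three age groups, and all figures are hypothetical, chosen so that the age-specific rates are identical while the age distributions differ.

```python
def crude_rate(populations, rates):
    """Formula (69): sum(P * q) / sum(P), taken over the age groups."""
    deaths = sum(p * q for p, q in zip(populations, rates))
    return deaths / sum(populations)


# Identical death rates at each age in both communities (hypothetical).
rates = [0.002, 0.010, 0.080]

# Hypothetical populations by age group: young, middle, old.
young_town = [6000, 3000, 1000]
old_town = [2000, 3000, 5000]

crude_young = crude_rate(young_town, rates)
crude_old = crude_rate(old_town, rates)

print(f"young community: {crude_young:.4f}")
print(f"old community:   {crude_old:.4f}")
```

Although mortality at each age is the same, the crude rate of the older community comes out several times that of the younger one, solely because of the weights $P$.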

Although the crude death rate is usually stated for all ages from birth,
it may, of course, also be computed similarly for ages above any age $x$.
Also, instead of being founded upon the original population data, as is
usually the case, the graduated data as derived from the life table may be
used; and in that case the crude death rate above age $x$ becomes

$$\frac{L_x m_x + L_{x+1} m_{x+1} + \cdots}{L_x + L_{x+1} + \cdots} \quad\text{or}\quad \frac{l_x q_x + l_{x+1} q_{x+1} + \cdots}{l_x + l_{x+1} + \cdots}$$

according as the central death rate $m_x = \frac{d_x}{L_x}$ or the probability of dying
$q_x = \frac{d_x}{l_x}$, as found from the life table, is used as the basis. In the first
form above (sometimes referred to as the “life-table death-rate”) it has
attained some prominence because, being $\frac{l_x}{T_x}$, it is the reciprocal of $\mathring{e}_x$. It
must therefore be noted that all the above crude rates, including this
“life-table death-rate” and its reciprocal $\mathring{e}_x$, are weighted averages in which
the weights are derived from the data of the particular community under
examination. The weights used in the crude rate (69), for example, are the
original $P$’s, and those used in the life-table death-rate and $\mathring{e}_x$ are the
$L$’s of the life table. The rates (69)-(72) are thus dependent on the
particular original age (and sex) constitution of the community under
examination, as shown by its $P$’s, such constitution resulting from the
particular birth, death, and migration rates which the community has
experienced. The “life-table death-rate” and $\mathring{e}_x$ are similarly dependent on
the particular age (and sex) constitution of the community as shown by the
$L$’s of the life table, those $L$’s resulting from the particular death rates
of the community under examination. In none of these cases, therefore,
are the weights independent of the particular community. Consequently,
if, say, the actual mortality rates in one community are double those in
another, the crude rate and the life-table death-rate will not necessarily
(or usually) be doubled. Comparisons of such rates may therefore
frequently lead to erroneous conclusions (cf. J.R.S.S., LXXXV, “Discussion
on the Value of Life-Tables in Statistical Research,” pp. 544-46, 552-54,
and 555; J.I.A., LII, 595; and T.A.S.A., XXXV, 281-82).
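The two statements above, that the life-table death-rate equals $l_x / T_x$ and that doubling the mortality rates does not double it, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the yearly probabilities of dying are invented, $L_x$ is taken as the mean of $l_x$ and $l_{x+1}$, and the table is closed by setting the last probability to 1.

```python
def life_table(q):
    """Return (l, L, T0) for yearly death probabilities q, with q[-1] == 1."""
    l = [1.0]
    for qx in q:
        l.append(l[-1] * (1.0 - qx))
    # Assumption: deaths spread evenly over each year, so L_x = (l_x + l_{x+1}) / 2.
    L = [(a + b) / 2.0 for a, b in zip(l, l[1:])]
    return l, L, sum(L)


def life_table_death_rate(q):
    """Return the rate both as a weighted average of m_x and as l_0 / T_0."""
    l, L, T0 = life_table(q)
    m = [(a - b) / Lx for a, b, Lx in zip(l, l[1:], L)]   # m_x = d_x / L_x
    weighted = sum(Lx * mx for Lx, mx in zip(L, m)) / sum(L)
    return weighted, l[0] / T0


# Hypothetical probabilities of dying; the final 1.0 closes the table.
base = [0.05, 0.01, 0.02, 0.05, 0.15, 0.40, 1.0]
doubled = [min(1.0, 2.0 * qx) for qx in base]

weighted, reciprocal = life_table_death_rate(base)
ratio = life_table_death_rate(doubled)[1] / reciprocal

print(weighted, reciprocal)   # the two forms agree
print(ratio)                  # well short of 2
```

The weights $L_x$ themselves shrink when mortality rises, which is why the ratio falls short of 2.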

139. The foregoing defect in the crude rate as a weighted average may
be remedied by the employment of a system of weights which are
independent of the age and sex constitutions of the particular community.
Consequently the Registrar-General of England and Wales in 1885
introduced the standardized death rate. The direct method of computing such a
standardized rate is to take a standard population, and to apply to this
standard population the death rates, at the various ages and sexes, of the
particular community; and then, by dividing the deaths so computed by
the standard population, to deduce the “standardized death rate” which
would emerge if the mortality rates of the particular community were to
prevail in the population of standard age and sex constitution.

The proper formulae for these standardized rates so as to maintain any
fundamental relationships which may exist between $q^m$, $q^f$, and $q^p$ are
shown in the paper referred to in the footnote at the commencement of
this section to be

Standardized male rate $= \dfrac{\sum(P_s^m q^m)}{\sum(P_s^m)}$,

Standardized female rate $= \dfrac{\sum(P_s^f q^f)}{\sum(P_s^f)}$,

Standardized persons rate $= \dfrac{\sum(P_s^m q^m + P_s^f q^f)}{\sum(P_s^p)}$,          (73)

where $P_s^m$, $P_s^f$, and $P_s^p$ ($= P_s^m + P_s^f$) are the standard male, female, and
persons populations.

This is the system used since 1915 by the Registrar-General of England
and Wales. It was also adopted in the U.S. Census Bureau’s Report on
“Mortality Rates, 1910-1920” and in some of the annual reports, and in
the Canadian reports on Vital Statistics. These “directly standardized”
rates, being absolutely independent of the age and sex distributions of
the particular community, may be compared with confidence.
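The direct method of par. 139 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Registrar-General's routine: the function names, the three age groups, and all figures are hypothetical, and the persons rate is formed by applying male and female rates to the standard male and female populations and dividing by the standard persons population.

```python
def expected_deaths(standard_population, rates):
    """Deaths that the community's rates would produce in the standard."""
    return sum(p * q for p, q in zip(standard_population, rates))


def direct_standardized_rate(standard_population, rates):
    """Directly standardized rate for one sex."""
    return expected_deaths(standard_population, rates) / sum(standard_population)


def direct_standardized_persons_rate(std_male, std_female, q_male, q_female):
    """Persons rate: expected male plus female deaths over the standard persons."""
    deaths = expected_deaths(std_male, q_male) + expected_deaths(std_female, q_female)
    return deaths / (sum(std_male) + sum(std_female))


# Hypothetical standard populations by age group.
std_male = [5000, 3000, 2000]
std_female = [4800, 3100, 2300]

# Hypothetical death rates of the community under examination.
q_male = [0.003, 0.011, 0.070]
q_female = [0.002, 0.009, 0.055]

male = direct_standardized_rate(std_male, q_male)
female = direct_standardized_rate(std_female, q_female)
persons = direct_standardized_persons_rate(std_male, std_female, q_male, q_female)

print(male, female, persons)
```

Formed this way, the persons rate is the average of the male and female standardized rates weighted by the standard male and female totals, so the relationship between the three rates is preserved.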

140. In numerical applications of this method the distributions of the




standard population obviously should be reasonable, for otherwise undue
importance would be given to the death rates of the particular
community at those ages where the standard population might be abnormally
large. The standard which has been employed widely is that of England
and Wales in 1901, because there are practical advantages for purposes of
comparison in retaining the same standard as long as possible. That 1901
population has relatively few infants and old people; other standards
have therefore been suggested (see T.A.S.A., XVIII, 280), of which one
of the most important is that recommended by the International
Statistical Institute and composed of the populations of a number of
European countries in 1900 or 1901. In order to reflect the considerable change
which has occurred since 1901 in the age distribution of the standard
population of that date, the Registrar-General of England and Wales in
1940 introduced a Comparative Mortality Index, which for any year $z$ is the
ratio of the standardized rate for year $z$ to the standardized rate for 1938
when both those rates are based on the mean of the age distributions of
the years 1938 and $z$ (see the “Registrar-General’s Statistical Review of
England and Wales for the Six Years 1940-1945,” published in 1949 as a
review for the whole six years in consequence of the disturbances due to
the war).
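A Comparative Mortality Index of this kind might be computed as in the Python sketch below. The sketch rests on an assumption that should be stated plainly: “the mean of the age distributions” is read here as the simple average of the two proportional age distributions. The function names and all figures are hypothetical.

```python
def proportions(counts):
    """Convert population counts by age group into proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def comparative_mortality_index(rates_z, rates_base, pop_z, pop_base):
    """Ratio of two standardized rates sharing one set of averaged weights."""
    # Assumption: weights are the mean of the two proportional distributions.
    weights = [(a + b) / 2.0
               for a, b in zip(proportions(pop_z), proportions(pop_base))]
    standardized_z = sum(w * q for w, q in zip(weights, rates_z))
    standardized_base = sum(w * q for w, q in zip(weights, rates_base))
    return standardized_z / standardized_base


# Hypothetical figures: rates in year z are 80 per cent of the base-year rates.
rates_base = [0.004, 0.012, 0.090]
rates_z = [0.8 * q for q in rates_base]
pop_base = [5000, 3500, 1500]
pop_z = [4200, 3600, 2200]

index = comparative_mortality_index(rates_z, rates_base, pop_z, pop_base)
print(index)
```

Because both rates use the same weights, a uniform fall in mortality at every age is returned unchanged by the index, whatever has happened to the age distribution in the meantime.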

In connection with this problem of choosing an appropriate standard
population, it is essential to remember that directly standardized rates
are really index-numbers which are constructed only for purposes of
comparison, so that no significance is to be attached to their absolute
magnitudes; the main characteristic of the standard population consequently
should be that it is not unnatural or clearly abnormal. In practice it will
be found that various standard populations chosen within reason have
insignificant effects upon the inferences to be drawn from comparisons of
directly standardized rates for different geographical areas at all ages
from all causes (see, for an example, the Census Bureau’s “Vital Statistics
in the United States, 1900-1940,” by F. E. Linder and R. D. Grove, pp.
80-81). Care, however, must of course be taken in drawing inferences from
comparisons of standardized rates in respect of subdivisions of a general
population, e.g., for certain causes of death or occupations, in which the
age distributions of the populations exposed to risk may show peculiar
characteristics sharply different from any normal general population; see
Sections XI and XII here.

141. The work of calculating and applying the death rates of the
particular community in the “direct” method of par. 139 is considerable if, as
is usual, it has to be done for a great number of districts (see, for example,
the numerical illustration in Newsholme’s “Vital Statistics,” p. 228). An




approximate indirect method has therefore been devised by the
Registrar-General’s office which does not necessitate the calculation and use of the
death rates of the particular community at each age (or age-group). In
this method an “Index Death Rate” is first calculated, being that crude
death rate which would be shown by the population of the district if the
mortality of the standard prevailed therein. That is, “Index Death Rate” =

$$\frac{\sum[(\text{Population of District}) \times (\text{Death Rate of Standard})]}{\text{Total Population of District}}$$

A “standardizing factor,” being

$$\frac{\text{Index Death Rate of Standard}}{\text{Index Death Rate of District}}$$

that is,

$$\frac{\text{Crude Death Rate of Standard}}{\text{Crude Death Rate of District which would be shown if standard rates prevailed therein}}$$

is now computed, this factor giving a measure of the extent to which the
crude rate for the district is affected by variations in age distribution
shown by its population as compared with the standard population.
Therefore, multiplying the crude death rate of the district by this
standardizing factor, a death rate for the district is found as would appear if
the age distribution of the district were the same as the age distribution
of the standard.

This indirect method is applied in the reports of the Registrar-General
mainly in the presentation of “persons” death rates, the separate male
and female rates not being given, because the persons rate may be
considered to give the most comprehensive single figure for the rapid
comparison of one district with another. The formula so used for the
“persons” rate is


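$$\frac{\sum(P^p q^p)}{\sum(P^p)} \times \frac{\sum(P_s^p q_s^p)}{\sum(P_s^p)} \times \frac{\sum(P^m + P^f)}{\sum(P^m q_s^m + P^f q_s^f)} \qquad (74)$$

in which the three factors are, in order, the crude “persons” rate of the
community, the “persons” rate of the standard, and the reciprocal of the
index “persons” death rate of the community, the last two together
forming the Standardizing Factor.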


A numerical example of its application may be found on p. 229 of
Newsholme’s “Vital Statistics.” Its rationale and limitations, and the
formulae for the separate “indirect” standardizations of the male and




female rates, are deduced by H. H. Wolfenden in J.R.S.S., LXXXVI,
408-10 (see footnote at commencement of this Section).

The method is an approximation only, and is not completely
independent of the age and sex distributions of the community as is the direct
method; the validity of its basic assumptions should therefore be tested
numerically, especially in important cases, prior to its extensive
application (see Wolfenden, op. cit., p. 408 and footnote). It should be noted
also, as pointed out in T.A.S.A., XVIII, 279, that comparisons of
indirectly standardized rates lead to the same conclusions as the method of
comparing the ratios of actual to expected deaths, since formula (74) can
be expressed as (Crude Rate of Standard) × [(Actual Deaths of
Community) ÷ (Expected Deaths of Community by Standard)].
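The indirect method, and its equivalence to the ratio of actual to expected deaths, can be illustrated as follows. The Python below is a minimal sketch with hypothetical names and figures; only the total deaths of the district are used, since the method does not require the district's own rates at each age.

```python
def crude_rate(populations, rates):
    """Crude rate produced when `rates` prevail in `populations`."""
    return sum(p * q for p, q in zip(populations, rates)) / sum(populations)


def index_death_rate(district_population, standard_rates):
    """Crude rate the district would show if standard rates prevailed in it."""
    return crude_rate(district_population, standard_rates)


def indirect_standardized_rate(district_population, district_deaths,
                               standard_population, standard_rates):
    """Crude rate of the district multiplied by the standardizing factor."""
    crude_district = district_deaths / sum(district_population)
    crude_standard = crude_rate(standard_population, standard_rates)
    factor = crude_standard / index_death_rate(district_population, standard_rates)
    return crude_district * factor


# Hypothetical standard population and standard death rates by age group.
standard_population = [5000, 3000, 2000]
standard_rates = [0.002, 0.010, 0.060]

# Hypothetical district: population by age group and total deaths only.
district_population = [2000, 3000, 5000]
district_deaths = 400

indirect = indirect_standardized_rate(district_population, district_deaths,
                                      standard_population, standard_rates)

# Equivalent form: crude rate of standard times actual / expected deaths.
expected = sum(p * q for p, q in zip(district_population, standard_rates))
equivalent = crude_rate(standard_population, standard_rates) * district_deaths / expected

print(indirect, equivalent)
```

The two printed figures agree, which is the identity noted at the end of par. 141.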



X 


THE FORECASTING OF MORTALITY RATES

142. In par. 64 the explanation and mathematical relationships were
stated by which the deaths $D_x^z$ in the year of age $x$ to $x + 1$, which occur
in the calendar year $z$, arise from the births of the calendar years $z - x - 1$
and $z - x$. In order now to keep this time of birth clearly in view, it will
be convenient simply to add to $D_x^z$ the symbol $y$ to identify the time of
birth, so that the deaths will be more fully visualized by ${}^yD_x^z$; that is to
say, ${}^yD_x^z$ will denote the deaths aged $x$ last birthday (i.e., in the year of
age $x$ to $x + 1$) which occur in the calendar year $z$ and were born at time $y$
(being in fact the calendar years $z - x - 1$ and $z - x$).

Correspondingly, ${}^yq_x^z$ will denote the probability of death in the year
of age $x$ to $x + 1$, as observed in the calendar year $z$ with respect to those
who were born at time $y$ (being the calendar years $z - x - 1$ and $z - x$).
If, therefore, we have available the data of a long series of calendar years
$n$, $n + 1$, $n + 2$, . . . (but have no data for any earlier years except with
respect to the births in year $n - 1$ which are required to determine ${}^nq_0^n$),
so that the $q$ values at all ages have been computed for each of those
calendar years, a complete tabulation of the results could be presented in
a table of the following form:


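YEAR OF AGE                          CALENDAR YEAR OF OBSERVATION z
x TO x+1     n                   n+1                 n+2                 n+3                 n+4

   0         ${}^{n}q_0^{n}$     ${}^{n+1}q_0^{n+1}$   ${}^{n+2}q_0^{n+2}$   ${}^{n+3}q_0^{n+3}$   ${}^{n+4}q_0^{n+4}$
   1                             ${}^{n}q_1^{n+1}$     ${}^{n+1}q_1^{n+2}$   ${}^{n+2}q_1^{n+3}$   ${}^{n+3}q_1^{n+4}$
   2                                                 ${}^{n}q_2^{n+2}$     ${}^{n+1}q_2^{n+3}$   ${}^{n+2}q_2^{n+4}$
   3                                                                     ${}^{n}q_3^{n+3}$     ${}^{n+1}q_3^{n+4}$
   4                                                                                         ${}^{n}q_4^{n+4}$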

143. From this table the values of $q$ with respect to age, year of
observation, and year of birth can be followed in three ways:

(1) By taking the rows, the values such as ${}^{n+1}q_0^{n+1}$, ${}^{n+2}q_0^{n+2}$, . . . ,
for age 0, or ${}^{n}q_1^{n+1}$, ${}^{n+1}q_1^{n+2}$, ${}^{n+2}q_1^{n+3}$, . . . , for age 1, and so on, show the

variation for a constant age according to time (the year of observation).
The births took place in different years.

(2) By taking the columns, the values such as ${}^{n+4}q_0^{n+4}$, ${}^{n+3}q_1^{n+4}$, ${}^{n+2}q_2^{n+4}$,
. . . , for the constant year of observation $n + 4$ show the variation
according to age for a constant year of observation (time). The births again
took place in different years.

(3) By taking the diagonals, the values ${}^{n}q_0^{n}$, ${}^{n}q_1^{n+1}$, ${}^{n}q_2^{n+2}$, . . . , for the
constant time of birth $n$ show the variation according to age and time
(year of observation). The births here took place in the same years.

Of these three procedures, method (1) gives in each row an apparently
simple view of the variation of the mortality of a given age according to
time; the time element, however, which is thus implicit in each value
varies according to both the year of observation and the year of birth, so
that each row shows in reality the manner in which the $q$ of a particular
age is influenced by the generation from which it originally arose and also
by the calendar year in which the deaths occurred.

Method (2) is of course the usual process of using a fixed calendar year
(or often in practice several consecutive calendar years) as the period of
observation in which to determine values of $q_x$ for the various ages; but
although the time of observation is thus fixed, it will be seen that the
deaths at the various ages, having been born at different times, are in fact
the survivors from different generations.

Method (3), however, sets out with the people born only at time $n$, and
follows them through successive years of age as they pass through the
corresponding successive years of observation; and as it thus deals with
the mortality history of the particular generation born at time $n$, it has
been called the generation method.
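The three readings of the table in par. 143 amount to three ways of indexing the same grid. The Python sketch below is illustrative only: it takes a hypothetical first year $n = 1900$, stores for each cell merely its generation label $y = z - x$ rather than any real rate, and uses function names invented here.

```python
def build_table(first_year, n_years):
    """Map (age, year) to the generation label y = year - age."""
    table = {}
    for year in range(first_year, first_year + n_years):
        for age in range(year - first_year + 1):
            table[(age, year)] = year - age
    return table


def by_age(table, age):
    """Method (1): a row, giving (year of observation, generation) pairs."""
    return [(year, table[(a, year)]) for (a, year) in sorted(table) if a == age]


def by_year(table, year):
    """Method (2): a column, giving (age, generation) pairs."""
    return [(age, table[(age, y)]) for (age, y) in sorted(table) if y == year]


def by_generation(table, birth):
    """Method (3): a diagonal, giving (age, year of observation) pairs."""
    return [key for key in sorted(table) if table[key] == birth]


table = build_table(1900, 5)   # hypothetical n = 1900, years n .. n+4

print(by_age(table, 1))             # generation changes along the row
print(by_year(table, 1904))         # generation changes down the column
print(by_generation(table, 1900))   # one generation followed through life
```

Only the third reading keeps the generation fixed; in the first two the generation label changes from cell to cell, which is the point made in the paragraphs above.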

144. These concepts are important when, having ascertained the
values of $q$, we consider their applicability in some future calendar year.
For, if mortality is continually changing over time, it evidently may be
advisable to “forecast” the values of $q_x$ to be anticipated in future years,
before applying them to the long-range prediction of populations (cf. the
remarks in par. 63(iv) here), or in other demographic or financial
estimations. In order thus to forecast, for example, a value for $q_{30}$ in some future
calendar year $n + m$, it will be clear from the preceding table that we
could use (i) the values of $q_{30}$ along the row, in which case extrapolation
to the right of the complete table to the year $n + m$ would predict
$q_{30}^{n+m}$ from known values of $q_{30}$ for past calendar years which have been
influenced by (a) year of observation and (b) year of birth, although no
means are available for measuring the relative importance of (a) and (b).
By (ii) proceeding down the columns, no extrapolation for the future



year $n + m$ can be made, since the year of observation does not vary in
the column. Finally, (iii) following the diagonals, the unknown $q_{30}^{n+m}$
could be predicted by extrapolation to the southeast of the table. Here
extrapolation from the diagonal for time of birth $n$ (i.e., the lowest
diagonal appearing in the table) would give ${}^{n}q_{30}^{n+30}$; from the next higher
diagonal for the generation born at time $n + 1$ an extrapolation would
provide ${}^{n+1}q_{30}^{n+31}$; the next diagonal for $n + 2$ would produce ${}^{n+2}q_{30}^{n+32}$, and so on.
A value for the specific year $n + m$, if desired, would then be determined
from the series of values so found.

145. Applications of these principles to population mortality have been
considered in a number of papers. In his “Mathematical Theory of
Population” (Appendix A, 1911 Census of Australia), p. 380, Knibbs
advocated the construction of “fluent” life tables which would recognize the
type of trend shown by method (1). In Britain, at about the same time
that the problem was receiving close attention with respect to the
forecasting of mortality rates for annuitants,* V. P. A. Derrick
(“Observations on (1) Errors of Age in the Population Statistics of England and
Wales, and (2) Changes in Mortality Indicated by the National Records,”
J.I.A., LVIII, 117) examined the graphs of the national mortality rates
for a number of age groups over the years from 1846 to 1923, and then
replotted the values by year of birth as in method (3); and since the
latter were to some extent roughly parallel, the suggestion was raised that
the year of birth may have an influence upon the mortality shown by its
generation in future years.

Subsequently W. O. Kermack, A. G. McKendrick, and P. L. McKinlay
(in two papers on “Death Rates in Great Britain and Sweden,” The
Lancet, 1934, 698, and The Journal of Hygiene, XXXIV, 433), again
supposing that mortality is dependent on the attained age and the time
of birth, assumed that mortality rates could be expressed approximately
as the product of two functions related respectively to the attained
age and the generation, so that $q$ could be examined in the form

* The principal references to be consulted on this allied subject would be: “The
Mortality of Annuitants, 1900-1920,” by W. P. Elderton and H. J. P. Oakley (The
Institute of Actuaries); “Forecasting Mortality,” by W. P. Elderton (Skandinavisk
Aktuarietidskrift, 1932, 45); “Mortality Experience of Government Life Annuitants,
1900-1920,” by Sir A. W. Watson and H. Weatherill (of which a résumé appears in
J.I.A., LV, 141); “On the Calculation of Rates of Mortality,” by A. R. Davidson
and A. R. Reid (T.F.A., XI, 183); discussions of the forecasts from the British offices
and British Government life annuitants’ investigations of 1900-1920 to be found in
J.I.A., LIII, 21.^, and LIV, 4.1; and “A New Mortality Basis for Annuities,” by W. A.
Jenkins and E. A. Lew (T.S.A., I, .W), which includes comments on many of the
analyses considered in this Section.





$Q(x) \cdot R(x - t)$ where $Q(x)$ is a function of the age alone and $R(x - t)$
is a generation function based on the mortality $t$ years before at age $x - t$.


If the ratios 


V\n z\n 

'^r 


on this hyix)lhesis are then computed from known 


values of q, the age function y(A*) disiippears, and the Ireiiils of llic result¬ 
ing ratios of the /?’s can be used for an arithmetical forecasting of values 
for future years. This method was discussed further by M. Greenwood,
who commended its simplicity and its consequent possibilities as a method
for short forecasts (see his “English Death Rates Past, Present, and
Future,” J.R.S.S., XCIX, 674). Later E. C. Rhodes adopted a logistic
(see par. 6.?(iii) here) to represent the generation function R, and applied
the consequent theory with results which were in substantial agreement
with those reached arithmetically by Kermack and his co-authors;
Rhodes merely claimed, however, that his investigations showed the
reasonableness of the hypothesis that mortality might be represented by
the form $Q(x) \cdot R(x - t)$, and he specifically warned against the use of
his results for extrapolation (see his paper on “Secular Changes in Death
Rates,” J.R.S.S., CIV, 15).
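
The cancellation of the age function in such ratios may be illustrated by a minimal sketch. The functions `Q` and `R` below, their parameters, and the indexing of the generation by year of birth (calendar year less attained age) are assumptions made purely for illustration; they are not the forms fitted by Kermack and his co-authors or by Rhodes.

```python
import math


def Q(age):
    """Hypothetical age function (a Gompertz-like curve, for illustration only)."""
    return 0.0005 * math.exp(0.08 * age)


def R(birth_year):
    """Hypothetical generation function, declining for later years of birth."""
    return math.exp(-0.01 * (birth_year - 1850))


def product_model_rate(age, year):
    """Mortality rate under the product hypothesis: age factor times generation factor."""
    return Q(age) * R(year - age)


age, n = 50, 10
rate_now = product_model_rate(age, 1930)
rate_later = product_model_rate(age, 1930 + n)

# At a fixed age the factor Q(age) cancels, leaving a ratio of the R's only.
ratio = rate_later / rate_now
ratio_of_generation_factors = R(1930 + n - age) / R(1930 - age)
print(ratio, ratio_of_generation_factors)
```

Because only the generation factors remain, a trend observed in such ratios could be carried forward without any knowledge of the age function, which is the feature commended for short forecasts.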

Logistic curves were also illustrated in Greenwood’s paper (loc. cit.) as
a means of representing, for several ages x, the series of values along
the rows as in method (1). In another paper by H. Cramér and H. Wold
(“Mortality Variations in Sweden: A Study in Graduation and Forecasting,”
Skandinavisk Aktuarietidskrift, 1935, 161), Makeham’s formula
was employed to represent, by separate fittings, both the mortality rates
down the columns as in method (2), and those along the diagonals as in
method (3); the series of Makeham constants so found for the “period”
mortality rates of the various columns were then graduated by using
logistics for log c and β and straight lines for α, and the “generation”
constants derived from the diagonals were graduated similarly; and
finally the “period” and “generation” values were used to give forecast
values of $\mu_x$ for six decennial ages.*

These various methods have been compared by A. H. Pollard in a
paper on “Methods of Forecasting Mortality Using Australian Data” in
J.I.A., LXXV, 151, and comments on many of the analyses considered in
this Section may be found in the paper by Jenkins and Lew previously
noted.


* In this paper by Cramér and Wold, a valuable list of additional references on
mortality forecasting is given, and the methods of fitting the Makeham and logistic
curves (as epitomized in “The Fundamental Principles of Mathematical Statistics,”
pp. 327 and 331) are noteworthy.



XI 


MORTALITY BY CAUSE OF DEATH 

146. In analyzing statistical data with respect to cause of death, it
will be evident that the first essential step must be the classification and
grouping of the many medical terms by which the various causes are
described on the death certificates, and the evolution of a practicable
method for allocation into groups according to some standard pattern
which will afford reasonable comparabilities and at the same time will
deal effectively with the distinctions between primary and contributory
causes (see par. 22). The complexities of this problem have been dealt
with by the gradual development of the “International List of Causes of
Death,” which was originated at the First Statistical Congress in 1853 at
Brussels (when Dr. William Farr and Dr. Marc d’Espine were appointed
to prepare a plan for international agreement), and after several revisions
was presented to the International Statistical Institute in 1893 by Dr.
Jacques Bertillon. The List has since been revised about every ten years
at international conferences in Paris in order to give effect to advances in
medical practice and changes in terminology. It was first used by a few
countries in 1893, and has now been adopted officially by almost all
countries which maintain national statistical offices.

The latest revision, the sixth, which was adopted at Paris in 1948
and in the same year was recommended and published by the World
Health Organization in Geneva, made an important change by enlarging
the former list of causes of death into a comprehensive “International
Statistical Classification of Diseases, Injuries, and Causes of Death,” and
thus for the first time provided a uniform basis for the classification of
both mortality and morbidity statistics (see also par. 182 here). In the
United States the National Office of Vital Statistics commenced to use
this revision on January 1, 1949, together with the new death certificate
(par. 22) which also follows the international recommendation of the
World Health Assembly. The classification has 612 categories of diseases
and morbid conditions, 153 categories for external causes of injury, and
189 categories covering injuries according to the nature of the lesion. An
abridged “Intermediate List” of 150 causes of morbidity and mortality
was also published for age and other population studies, and an “Abbreviated
List” of 50 causes of death was proposed for tabulations of mortality
and for the purposes of public health administration. The two-volume
Manual covering this revision includes rules for classification and
an alphabetical index of medical terms for coding medical records and
death certifications.

These Sixth Revision lists thus now replace the several versions of the
“Detailed International List,” the “Abridged International List,” and
“The Short List of the Registrar-General” of England and Wales to
which many references will be found in the literature prior to 1949. The
periodical changes which have occurred in the various revisions have
produced difficulties, of course, in the attempt to compute statistics
which would be comparable over successive decennia; in the
Registrar-General’s “Statistical Review of England and Wales for the Six Years
1940-1945,” for example, the Fifth Revision of 1938 had produced such
disturbances that dual tabulations of deaths were made according to the
new and old procedures, and conversion factors were calculated for
application to the deaths and death-rates prior to 1939 in order to make
them comparable to tabulations by the new classification. Special studies
of the effects of changing to the new list when a revision is made have also
been undertaken by the U.S. Bureau of the Census (see “Vital Statistics
Rates in the United States, 1900-1940,” by F. E. Linder and R. D. Grove,
pp. 18-26). The conference on the Sixth Revision also recommended that
deaths in 1949 or 1950 should be coded according to both the sixth and
fifth revisions, with dual tabulations to indicate the character and extent
of the changes.
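
The use of a dual tabulation to derive a conversion factor may be sketched as follows. The function name, the counts, and the rates are hypothetical figures chosen only to show the arithmetic; they are not taken from the Registrar-General’s Review or from the Census Bureau studies.

```python
def conversion_factor(deaths_new_rules, deaths_old_rules):
    """Ratio of the deaths assigned to a cause under the new rules to those under the old."""
    if deaths_old_rules <= 0:
        raise ValueError("a positive count under the old classification is required")
    return deaths_new_rules / deaths_old_rules


# Hypothetical dual tabulation of one year's deaths for a single cause.
factor = conversion_factor(deaths_new_rules=9_000, deaths_old_rules=10_000)

# Hypothetical death rates per 100,000 for earlier years, tabulated under the old rules.
old_basis_rates = {1936: 120.0, 1937: 118.0, 1938: 115.0}

# Earlier rates restated on the basis of the new classification.
comparable_rates = {year: rate * factor for year, rate in old_basis_rates.items()}
print(factor, comparable_rates)
```

A single factor of this kind presumes that the effect of the change of classification is stable over the years to which it is applied; in practice separate factors by age and sex may well be preferable.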

147. One of the most difficult problems encountered in classifying
deaths by cause arises from the essential statements on the death
certificates with respect to “immediate” and “contributory (secondary)”
causes, whence the primary cause under which each death shall be
tabulated must be determined for statistical purposes. (The terms just
stated are those used, with variations, on the death certificates of Great
Britain, Canada, and the United States prior to the recommendations of
the sixth revision conference.) The main objective is to place upon the
physician the responsibility for stating the causes of death in such a
manner that the primary cause can thence be determined satisfactorily and
the deaths tabulated accordingly. The importance and complexity of the
problem may be judged from the fact that, in countries where medical
practice is most advanced, from 33% to about 70% of all death certificates
show multiple causes (see H. L. Dunn, “The Evaluation of the Effect
upon Mortality Statistics of the Selection of the Primary Cause of
Death,” J.A.S.A., XXXI, 116; the U.S. Census Bureau’s Vital Statistics
Special Report No. 47, Vol. V; and Linder and Grove, op. cit., pp. 21-23).




It must be remembered, also, that the determination of the underlying
cause amongst a number of possibly contributory causes is often a
baffling problem for even an experienced physician, and that this difficulty
is not lessened by the desires of registration officials to fit the ultimate
decision into prescribed classes for statistical tabulations.

In Britain the primary cause for tabulation was settled by flexible rules
until 1939; in 1940, however, rules were superseded by a new plan of
selection based on acceptance of the certifying physician’s opinion as to
the underlying cause. In the United States between 1914 and 1949 the
primary cause was determined from the death certificates then in use by
a set of “priority tables,” founded on general rules and many special
decisions, which were published by the Census Bureau in several editions
of the “Manual of Joint Causes of Death,” this inflexible system having
the advantages of consistency and uniformity, but the weakness that it
may disregard the physician’s own indication as to the primary cause.

The British plan of accepting the physician’s statement of the underlying
cause led to its recommendation by the Sixth Revision conference
and its adoption by the World Health Assembly. In accordance with
those resolutions, therefore, the new 1949 United States death certificate
(see par. 22) calls for (i) the disease or condition leading directly to
death, (ii) antecedent causes, and (iii) other significant conditions. The
intention of this phraseology is that the physician, as in Britain, shall thus
show clearly his own determination of the underlying cause (under which
the death would be tabulated statistically). The physician in this system
consequently has “both a heavy responsibility and a great opportunity to
make mortality statistics reflect the true frequencies of the underlying
causes of death” (see the “Physicians’ Handbook on Death and Birth
Registration,” 10th Edition, 1949, and also the 9th Edition of 1939 for
the practices prior to 1949 and further comments on the advantages of
the new plan).

With the cooperation of physicians in thus stating the underlying cause
on the new certificate, it is anticipated that marked improvement will be
seen in the reliability of the data and the facility with which they can be
tabulated. Public health authorities also expect that the information thus
secured will be more useful in the control and prevention of some of the
initial causes.

148. In the periodical reports on vital statistics issued by government
offices it is customary for practical purposes to state the death rate from
any cause as the ratio of the deaths from that cause, by age groups and
sex, to the enumerated population for the area concerned, and then to
institute comparisons between the rates so calculated.*

Crude rates, which do not take differing age and sex distributions into
account, will of course be unsatisfactory, as explained in par. (and
T.A.S.A., XVIII, 275). In order to have a comprehensive method which
will give a single mortality figure for each cause in each community and at
the same time will be independent of the varying age and sex distributions,
the process of direct standardization is frequently used (cf. the
reports of the Registrar-General of England and Wales, and of the National
Office of Vital Statistics previously in the Bureau of the Census in the
United States, and J.I.A., XXXVI, 122-26).
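
The contrast between crude and directly standardized rates may be sketched as below. The age groups, rates, and populations are hypothetical figures chosen for illustration, and the helper `weighted_rate` is simply a name adopted here; the two towns are given identical age-specific rates so that any difference in their crude rates arises from age distribution alone.

```python
def weighted_rate(age_specific_rates, population_by_age):
    """Average the age-specific rates, weighting each by the population in its age group."""
    total_population = sum(population_by_age.values())
    weighted_sum = sum(
        age_specific_rates[group] * count for group, count in population_by_age.items()
    )
    return weighted_sum / total_population


# Hypothetical age-specific death rates from one cause, the same in both towns.
rates_a = {"0-39": 0.001, "40-64": 0.005, "65+": 0.030}
rates_b = {"0-39": 0.001, "40-64": 0.005, "65+": 0.030}

# Hypothetical populations: town A is young, town B is old.
town_a = {"0-39": 700, "40-64": 250, "65+": 50}
town_b = {"0-39": 400, "40-64": 350, "65+": 250}
standard = {"0-39": 550, "40-64": 300, "65+": 150}

# Crude rates use each town's own age distribution as weights.
crude_a = weighted_rate(rates_a, town_a)
crude_b = weighted_rate(rates_b, town_b)

# Directly standardized rates apply each town's rates to the same standard population.
standardized_a = weighted_rate(rates_a, standard)
standardized_b = weighted_rate(rates_b, standard)
print(crude_a, crude_b, standardized_a, standardized_b)
```

The crude rate of the older town is much the higher, although the age-specific mortality is the same; the standardized figures coincide, as they should.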

149. Mortality tables analyzed by causes (or groups of causes) of death
are sometimes constructed. Extensive tables of this kind were published
by E. B. Nathan in T.F.A., X, 45, on the basis of the 1911 census of England
and Wales and the deaths of 1911-12 for 20 main groups of causes.
Similar calculations from the 1920 census of the United States and the
deaths of that year have been made by L. I. Dublin, E. W. Kopf, and
A. J. Lotka (published in part in the American Journal of Hygiene, VII,
299). Analyzed tables based on the 1940 U.S. census and the deaths of
1939-41 have been prepared jointly by the Statistical Bureau of the
Metropolitan Life Insurance Company and the National Office of Vital
Statistics.

The construction of such analyzed tables has been discussed by
H. Wyss (in Mitteilungen der Vereinigung Schweizerischer
Versicherungsmathematiker, No. 22 [1927], 111) and M. N. Karn (Annals of Eugenics,
IV, 279, and Biometrika, XXV, 91), and more fully by T. N. E. Greville

* For medical purposes the proportion of deaths to persons attacked by a particular
disease is often used to indicate the degree of virulence, although it will give misleading
results if the communities compared have dissimilar age and sex constitutions.

Another measure employed by some writers to estimate the relative magnitudes of
the various causes of death is the proportion of deaths for each (or any) cause to the
total deaths from all causes. This procedure, however, may be quite fallacious as a
means of comparing the mortalities by cause in different communities (or even in the
same community at different times), as may be seen clearly from the following
illustration by Dr. Ransome: “Suppose a town of 100,000 with 2,000 annual deaths of which
500 are caused by phthisis. Here the general death-rate is 20 per 1,000; the death-rate
from phthisis is 5 per 1,000 living and the deaths from phthisis form one-fourth of the
total deaths. In another town having the same population the total deaths are 4,000,
and therefore the death-rate is 40 per 1,000 inhabitants; the deaths from phthisis are
1,000 and therefore the death-rate from phthisis is 10 per 1,000; but the proportion of
the phthisical to the total mortality is one-fourth as before. In the second town,
therefore, there is by the latter test apparently no worse condition, so far as phthisis is
concerned, than in the first, though matters are really twice as bad” (Newsholme’s “Vital
Statistics,” 3rd Edn., pp. 186-87; and cf. the second footnote to par. 82 here).




in R.A.I.A., XXXVII, 283. Greville considers particularly the methods of
preparing analyzed tables from data in 5-year age groups (cf. the
principles stated in par. 129 here). Adopting the suggestion of Dublin, Kopf,
and Lotka he proposes for n-year groups between ages x and x + n the
formula

$$ {}_n d_x^{i} = {}_n d_x \cdot \frac{{}_n D_x^{i}}{{}_n D_x} $$

where D represents the actual death statistics and d the deaths in the
mortality table, and i indicates the ith cause or group of causes of
death.* He shows that this formula, while not strictly correct, may be
expected to give very accurate results. Summing ${}_n d_x^{i}$ from age x to the
end of the mortality table gives values at corresponding intervals of $l_x^{i}$,
namely, the number of survivors at age x who will eventually die from
the ith cause.
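
The proportional allocation of the table deaths by cause, and the subsequent summation from each age to the end of the table, may be sketched as follows. The function names and all figures are hypothetical and serve only to show the arithmetic of the two steps described above.

```python
def deaths_by_cause(table_deaths, observed_by_cause):
    """Split the table deaths of each age group in proportion to the observed deaths by cause."""
    allocated = []
    for deaths, observed in zip(table_deaths, observed_by_cause):
        observed_total = sum(observed.values())
        allocated.append(
            {cause: deaths * count / observed_total for cause, count in observed.items()}
        )
    return allocated


def eventual_deaths_from_cause(allocated, cause):
    """Sum the table deaths from one cause from each age group to the end of the table."""
    running_total = 0.0
    sums_from_end = []
    for group in reversed(allocated):
        running_total += group[cause]
        sums_from_end.append(running_total)
    return list(reversed(sums_from_end))


# Hypothetical mortality table with three age groups and a radix of 1,000.
table_deaths = [100.0, 300.0, 600.0]

# Hypothetical registered deaths by cause in the same three age groups.
observed = [
    {"A": 20, "B": 80},
    {"A": 150, "B": 150},
    {"A": 100, "B": 300},
]

allocated = deaths_by_cause(table_deaths, observed)
survivors_dying_of_a = eventual_deaths_from_cause(allocated, "A")
survivors_dying_of_b = eventual_deaths_from_cause(allocated, "B")
print(survivors_dying_of_a, survivors_dying_of_b)
```

At each age the figures for the several causes add up to the total survivors of the table, which provides a convenient check on the work.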

150. The preparation of special mortality tables showing what the
effect would be if a particular cause of death were eliminated completely
has also received attention. For this purpose it is usual to assume that the
various causes of death operate independently of each other. While this
assumption may properly be criticized, it is not practicable to evaluate
the degree of dependence between different causes of death, and the
results obtained by its use are of considerable interest. As first pointed out
by W. M. Makeham (J.I.A., XIII, 329, and XVIII, 317; see also H. H.
Wolfenden’s explanation in T.A.S.A., XLIII, 272-74), this assumption
of independence gives rise to a law of composition of decremental forces
under which the total force of mortality is the sum of the separate forces
of mortality from the various causes of death.† It follows that the
probabilities of survival must satisfy the relation

$$ {}_n p_x = {}_n p_x^{(-i)} \cdot {}_n p_x^{(i)} $$

where the (−i) refers to a mortality table in which all causes of death
except the ith cause are operating, and the (i) refers to a table in which
the ith cause alone is operating. Obviously ${}_n p_x^{(-i)}$ can be computed from
the formula just given if the values of ${}_n p_x^{(i)}$ are known.
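
The step from additive forces of mortality to a multiplicative rule for the probabilities of survival may be set out briefly. The symbols $\mu_x$, $\mu_x^{(i)}$, and $\mu_x^{(-i)}$ for the total force and its two components are notation adopted here for the purpose, on the usual understanding that a probability of survival is the exponential of the negative integral of the force of mortality.

```latex
\begin{align*}
\mu_{x+s} &= \mu_{x+s}^{(i)} + \mu_{x+s}^{(-i)}, \\
{}_n p_x &= \exp\!\left(-\int_0^n \mu_{x+s}\,ds\right) \\
         &= \exp\!\left(-\int_0^n \mu_{x+s}^{(i)}\,ds\right)
            \exp\!\left(-\int_0^n \mu_{x+s}^{(-i)}\,ds\right)
          = {}_n p_x^{(i)} \cdot {}_n p_x^{(-i)} .
\end{align*}
```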

* This superscript i for cause of death is used here so that Greville’s R.A.I.A. paper
may be followed easily; it must not be confused, of course, with the superscript z which
is employed throughout this Study to denote the calendar year (see par. 6-1).

† The theory of multiple decrement tables can be developed from the same point of
view. This is brought out by Wolfenden in the reference cited, and also subsequently
by C. J. Nesbitt and L. Van Eenam in R.A.I.A., XXVII, 202, and by W. G. Bailey
and H. W. Haycocks in their recent pamphlet “Some Theoretical Aspects of Multiple
Decrement Tables.”






In making this computation by single years of age, Greville considered
it satisfactory to assume (following a method used by Dublin and Lotka;
see R.A.I.A., XXXVII, 292) that $q_x^{(i)}$ is equal to $d_x^{i} / l_x$, namely, the
probability in the complete analyzed mortality table that a survivor at age x
will die from the ith cause during the succeeding year. For data by 5-year
age groups, however, this method was not sufficiently accurate; accordingly
he developed the formula

$$ \operatorname{colog}\, {}_n p_x^{(i)} = \frac{{}_n d_x^{i}}{{}_n d_x}\, \operatorname{colog}\, {}_n p_x $$

which has the advantage that the multiplication rule for probabilities of
survival is satisfied automatically when ${}_n p_x^{(-i)}$ and ${}_n p_x^{(i)}$ are determined
analogously.
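
Since the cologarithm is the negative of the logarithm, the formula is equivalent to raising the probability of survival from all causes to the power given by the proportion of the table deaths due to the cause in question. A minimal sketch follows; the function name and the figures are hypothetical.

```python
import math


def survival_under_cause_share(n_p_x, share_deaths, total_deaths):
    """Raise the all-cause survival probability to the power share_deaths / total_deaths.

    This is the cologarithm formula written in exponential form.
    """
    return n_p_x ** (share_deaths / total_deaths)


# Hypothetical 5-year age group of an analyzed mortality table.
survivors_start, survivors_end = 90_000.0, 85_500.0
deaths_all_causes = survivors_start - survivors_end
deaths_cause_i = 900.0
deaths_other_causes = deaths_all_causes - deaths_cause_i

n_p_x = survivors_end / survivors_start

# Table in which the ith cause alone is operating.
p_cause_alone = survival_under_cause_share(n_p_x, deaths_cause_i, deaths_all_causes)

# Table in which all causes except the ith are operating.
p_cause_eliminated = survival_under_cause_share(n_p_x, deaths_other_causes, deaths_all_causes)

# The exponents add up to unity, so the multiplication rule holds automatically.
print(p_cause_alone * p_cause_eliminated, n_p_x)
```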

In order to compute values of the expectation of life in the special
mortality table from which the ith cause has been eliminated, Greville
suggested calculating the required L’s by the formula


where ${}_n d_x^{-i}$ are the deaths in the complete analyzed mortality table from
all causes except the ith cause. He showed that, although the values of
${}_n L_x^{(-i)}$ are consistently overstated to a very slight extent, the resulting
values of the expectation of life are not affected appreciably.

151. Reference has already been made in par. 118 to the deficiencies
of crude death rates by cause as an over-all index of the relative importance
of different causes of death. In public health administration it is often
desirable to have an index which will indicate properly the relative
importance of the different causes. Even standardized rates may be criticized
on the ground that they do not reflect the greater loss to society in the
death of a person in the prime of life as compared with that of an elderly
person who in any case would have lived only a few years. For this reason
certain measures based on “potential years of life lost” have recently been
proposed. For example, one might take the deaths occurring in a certain
calendar year from a given cause and multiply the deaths at each age by
the expectation of life at that age. The sum of these products might be
regarded as the aggregate potential years of life lost as a result of the
operation of the given cause of death. Theoretically, the expectations of
life used in this computation should be based on a special mortality table
from which the given cause of death has been eliminated; F. G. Dickinson
and E. L. Welker, however (in Bulletin 64, Bureau of Medical Economic
Research of the American Medical Association, entitled “What Is the
Leading Cause of Death? Two New Measures”) have concluded that for
comparative purposes it is sufficiently accurate to use the expectations of
life from a general mortality table.* W. Haenszel (American Journal of
Public Health, XL, 17) has suggested a similar measure in which, instead
of using expectations of life, the deaths at each age are simply multiplied
by the difference between that age and a fixed age such as 65, 70, or 75.
As compared with the index using expectations of life, this proposal has
the advantage of providing a uniform standard of comparison for different
countries, social classes, or time periods.

* In this paper consideration is also given to a second measure, based on temporary
expectations of life to age 65 only with the object of estimating “the potential years
of working-life lost.”
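
The two forms of the “potential years of life lost” index described in par. 151 may be sketched as follows. The ages, deaths, and expectations are hypothetical, as are the function names; the treatment of deaths above the fixed age as contributing nothing is an assumption made here, since the text does not state how such deaths are to be handled.

```python
def years_lost_by_expectation(deaths_by_age, expectation_by_age):
    """Sum of deaths at each age multiplied by the expectation of life at that age."""
    return sum(
        deaths * expectation_by_age[age] for age, deaths in deaths_by_age.items()
    )


def years_lost_to_fixed_age(deaths_by_age, fixed_age):
    """Sum of deaths at each age multiplied by the years remaining to a fixed age.

    Deaths at or above the fixed age are assumed to contribute nothing.
    """
    return sum(
        deaths * max(fixed_age - age, 0) for age, deaths in deaths_by_age.items()
    )


# Hypothetical deaths from one cause in a calendar year, by age at death.
deaths_by_age = {30: 10, 60: 40, 80: 100}

# Hypothetical expectations of life from a general mortality table.
expectation_by_age = {30: 40.0, 60: 16.0, 80: 6.0}

index_by_expectation = years_lost_by_expectation(deaths_by_age, expectation_by_age)
index_to_age_65 = years_lost_to_fixed_age(deaths_by_age, fixed_age=65)
print(index_by_expectation, index_to_age_65)
```

The second index requires no mortality table at all, which is the source of the uniformity of standard claimed for it.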



XII 


OCCUPATIONAL MORTALITY 

152. Since the establishment of efficient registration systems in various
countries, numerous investigations have been made with the object of
estimating as closely as possible the mortalities in different occupations,
and especially the particular hazards in each occupational group as they
may be indicated by analyses of the causes of death. For England and
Wales occupational mortality studies were made by Dr. Farr from the
census data of 1861 and 1871 and the deaths of 1860-61 and 1871; Dr.
Ogle considered the census material of 1881 and the deaths of 1880-82;
Dr. Tatham based two examinations on the censuses of 1891 and 1901
and the deaths of 1890-92 and 1900-1902 respectively; these periodical
investigations were continued in the two Registrar-General’s Decennial
Supplements on mortality in certain occupations as indicated by the
censuses of 1911 and 1921 and the deaths of 1910-12 and 1921-23 which
were completed under the guidance of Dr. T. H. C. Stevenson; and the
data of the 1931 census and the deaths of 1930-32 have been analyzed in
the Decennial Supplement prepared under Dr. Percy Stocks (see the
various official reports here noted; T.A.S.A., XXIX, 551; and P. Stocks,
J.R.S.S., CI, 669). For Scotland similar analyses were published by Dr.
Blair-Cunynghame and Dr. Dunlop from the censuses of 1891 and 1901,
and other studies have also been made on the Continent of Europe (see
T.F.A., V, 1, and J.I.A., LV, 266).

153. As already mentioned in par. 49, certain difficulties (in addition to
the usual errors of age, etc., affecting all statistics of the general
population) are encountered in the calculation and comparison of occupational
mortality rates. These difficulties are involved mainly in (i) the methods
of classifying the occupations; (ii) discrepancies between the occupational
designations on the census schedules and on the death certificates; (iii)
the determination of a satisfactory population at risk in each occupational
group; (iv) the relation of physical fitness, environment, social-economic
status, and other factors to the nature of the occupation; and (v) the
method of comparing the rates of mortality of different occupations.

(i) The evolution of a method of recording and classifying occupations
which will satisfy the varied requirements of governments, industrialists,
sociologists, actuaries, statisticians, and others who use the resulting
statistics in different forms has always been a problem of great difficulty,
which has received much attention in the various census offices and
elsewhere. In order to achieve many desired purposes the data should indicate
at least (a) the worker’s position in his industry, i.e., whether an
employer, an employee, or working on his own account, (b) his skill and
intelligence, (c) the special services rendered or processes performed, and
(d) the healthfulness and other important features of the occupation.
However, an extremely complicated classification would be necessary to
reflect all those influences in any manner which would permit of
subsequent statistical analyses, and one of the accompanying disadvantages
would be that in some of the numerous subdivisions the material would
be very small. The type of classification now generally adopted in
practice, therefore, is an occupational classification with an industrial
framework, since many occupations mean little when they are considered apart
from the industries in which they are pursued.

In Great Britain the classification prior to the 1921 census had been
mainly on an industrial basis (so that a clerk in a textile company would
be included in the textile industry). In 1921, however, the tabulations
were made by industry and by occupation, in accordance with a resolution
of the British Empire Statistical Conference that the basic principle of
the industrial classification should be the “product” or “type of service,”
and that of the occupational classification the “process carried out” and
the “material worked in,” with the primary classification in the latter
case according to the “material worked in” so as to avoid, for example,
“the consolidation of picklers of onions and picklers of metals which
might result from a primary classification by process” (cf. T.A.S.A.,
XXIX, .kll). For the 1*151 census the terms “industry” and “occupation”
were defined very carefully in the “Classification of Occupations”
published by the General Register Office. In the United States, similarly, the
questions on the population schedule were enlarged in 1910 so that each
person was required to state (1) “Trade or profession of or particular
work done by this person, as spinner, salesman, laborer, etc.”; (2) “General
nature of the industry, business, or establishment in which this person
works, as cotton-mill, dry-goods store, farm, etc.”; and (3) “Whether
employer, employee, or working on own account.” These enquiries have been
developed until in the 1950 U.S. census (in addition to questions as to
being employed or unemployed, and hours of work) every person was
asked (1) “What kind of work was he doing, for example, nails heels on
shoes, chemistry professor, farmer, farm helper, etc.”; (2) “What kind of
business or industry was he working in, for example, shoe factory, state
university, farm, etc.”; (3) “Class of worker, i.e., for private employer,
for Government, in own business, or without pay on family farm or
business.” In the Canadian census of 1941, the question respecting the
occupation similarly asked for (1) “Trade or profession,” etc.; the query
with regard to the industry, however, was divided into (2) “Kind of
product or service, as for example rubber shoes, drugs, etc.,” and (3)
“Branch of industry, as for example manufacturing, retail trade, etc.”;
and status was covered by a question as to (4) “Employer, own account,
wage-earner, or unpaid family worker.” The wording of the questions in
the 1951 Canadian census was (1) “Occupation (what kind of work did
this person do in this industry? For example, office clerk, sales clerk, auto
mechanic, iron moulder, graduate nurse, etc.)”; (2) “Industry (what kind
of business or industry is this? For example, rubber shoes manufacturing,
drugs retail trade, grain farming, etc.)”; and class of worker was to be
described as “wage or salary earner, employer, own account, or no pay.”

By such questions it is theoretically possible to obtain the numbers in
each specific elementary occupation in each industry or service group (cf.
the U.S. Census Bureau’s Classified Index of Occupations, and its
Alphabetical Index of Occupations and Industries; in this connection note also
the League of Nations’ classification for the gainfully occupied population
which was recommended by the Inter-American Statistical Institute for
its proposed 1950 hemispheral census; see Estadística, March, 1945,
p. 11). In tabulating the results of such enquiries, however, it is often
difficult to follow a detailed classification by occupation in each industry
(as had been attempted in Vol. IV, Occupation Statistics, of the 1910
U.S. Census), on account of ignorance or indifference on the part of the
individuals enumerated and the carelessness of the enumerators. The
results consequently now are often published “by grouping together all the
workers in each separate occupation without regard to the different
industries in which the occupation is pursued” (see Vol. IV, 1920 U.S.
Census), in order to reach finally a classification for statistical purposes
which shall be as nearly as possible occupational rather than industrial.

(ii) The discrepancies which frequently occur between the occupational
designations used by the individual and the enumerator on the
census schedule, and those given by relatives on the death certificate,
cause much trouble in the attempt to produce reliable occupational
mortality statistics from the general population. This has proved to be the
case despite efforts to elicit a statement of occupation on the death
certificate in conformity with the occupational groups of the census
classifications (cf. par. 49 here; T.A.S.A., XIX, 136 and 140; J.I.A., LV,
267; P. Stocks, J.R.S.S., CI, 707; and H. L. Dunn, J.A.S.A., XXXV, S9).

(iii) In attempting to determine a satisfactory population at risk for
each occupational group it must be remembered that the numbers
enumerated in each occupation at a census necessarily include only those
who were then so occupied. Some censuses, therefore, have asked
questions of everyone with regard to the previous occupation or the
occupation at the preceding census. The recent introduction of additional
questions for a sample of the population, moreover, has provided
opportunities for sampling queries as to kind of work, kind of business or industry,
and class of worker in respect of the last job done by those who were not
occupied at the date of the census (cf. par. 12).

The material so available has an important bearing on the basis to be
adopted for the determination of the population at risk. Mortality rates,
for example, computed only from the numbers occupied at the census
date take no account of those who were formerly occupied and may have
been forced to retire by reason of ill-health or disablement directly caused
by the occupation; consequently they would understate the true
occupational mortality in the more dangerous trades. In the occupational study
based on the 1901 census of England and Wales and the deaths of
1900-1902, therefore, those reported as being “unoccupied” or “retired” were
classified also according to their previous occupations, so that for each of
the age-groups 25-34, 35-44, 45-54, and 55-64 (comprising the main
working years of life) the mortality rates were based firstly on the
“occupied,” and secondly on those “occupied and retired.” In the
corresponding report on the 1921 census the age limits of 25 and 65 were
extended to 20 and 65 because it was felt that even at age 20 the average
worker has been subjected to the environment of his occupation
sufficiently long that his mortality may be influenced by it. The data were
also examined only for ages 35 to 65 (as well as for ages 20 to 65) in the
report on the 1931 census, because to some extent objections may be
raised to the inclusion of the earlier ages in attempts to assess the effects
of occupation on mortality. The inclusion of the retired with the occupied
is not, of course, an ideal basis, because the retired are secluded from
the special hazards of their former occupations, and the statistics of the
retired are not so reliable as those of the occupied; but it gives probably
a more satisfactory basis between ages 20, 25, or 35 and 65 than the
occupied alone, since it takes into account the mortality of those who have
been forced to retire as a result of the hazards of the occupation (see Part
II of the Supplement to the 65th Report, and the 1921 Decennial
Supplement, of the Registrar-General of England and Wales; J.I.A., XLIII, 231;
and T.A.S.A., XVIII, 269).
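
The effect of the choice of population at risk may be seen from a minimal sketch. All the figures are hypothetical and relate to a single occupation and age group; the only point illustrated is that a rate based on the occupied alone omits the heavier mortality of those who have retired from the occupation.

```python
def death_rate(deaths, population):
    """Deaths divided by the population at risk."""
    return deaths / population


# Hypothetical figures for one dangerous trade in one age group.
occupied_population, occupied_deaths = 10_000, 80
retired_population, retired_deaths = 500, 40

# Basis 1: those occupied at the census date only.
rate_occupied = death_rate(occupied_deaths, occupied_population)

# Basis 2: the occupied together with the retired classified to their former occupation.
rate_occupied_and_retired = death_rate(
    occupied_deaths + retired_deaths,
    occupied_population + retired_population,
)
print(rate_occupied, rate_occupied_and_retired)
```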

(iv) In the preceding discussion of the basis to be used for the
populations at risk in each occupational group, physical fitness and environment
were mentioned incidentally as being two of the most important factors
which evidently must influence occupational mortality. Preliminary
selection naturally induces the physically unfit to take up the less exacting
occupations, and subsequent re-selections continually force those who are
engaged in strenuous callings to abandon such occupations on the
occurrence of some comparatively slight illness or mishap. The result is that in
general the computed mortalities of the more exacting occupations will be
too low, while the mortalities of those employments not requiring such
complete fitness, into which those becoming unfitted for the more
strenuous callings will be likely to drift, will be exaggerated.

Any attempt to measure effectively the influences of environment, such
as locality, density of population, type of housing, income level, standard
of living, etc., upon the mortalities of different occupations is, of course,
attended by formidable difficulties because of the impossibility of disentangling
the numerous and related causes. Since 1911, however, the
British reports have developed an approach by classifying the occupations
according to social-economic status in five broad groups, covering respectively
the upper and middle classes (professional, etc.), intermediate
classes, skilled workers, the partly skilled, and unskilled workers. Furthermore,
in order to employ a statistical method (as far as may be practicable
in so complex a problem) for measuring the social-economic influences and
thus eliminating them from the general picture of occupational mortality,
so that the direct effect of an occupation as a separate factor may
then be isolated, the Registrar-General’s 1931 Supplement (see also
P. Stocks, “The Effects of Occupation and of its Accompanying Environment
on Mortality,” J.R.S.S., CI, 669) used in each case the wives
of the occupied men as a “control group” (subject only to the social-economic
influences, but not to the direct effects of the occupation), and
thereby estimated the men’s occupational risk by causes of death (as well
as, incidentally, the influence of social-economic factors on the women).

(v) For comparing the rates of mortality of different occupations, the
principle of direct standardization (see pars. 139-40) has been used for
many years by the Registrar-General of England and Wales in the calculation
of a Comparative Mortality Figure for each occupation. The
standard population used in computing this figure is taken only from age
20 (previously 25) to 65 in order to cover approximately the effective
working years, and is reduced (to facilitate comparisons on a per mille
basis) so that the standard mortality rates therein would produce 1,000
deaths. It must be noted, however (in contrast with the reliability which
is to be anticipated, as stated in par. 139, from standardizations of the
death rates of different communities at all ages and from all causes, when
no subdivisions by occupation are made), that distortions may sometimes
result in comparing these standardized “comparative mortality figures”
for various occupational subdivisions, because the age distributions in
some occupations are sharply abnormal and are wholly different from
those of any typical standard population. The Registrar-General consequently
added comparisons of actual and expected deaths in the 1921 and
1931 reports (see the review in T.A.S.A., XXIX, 331, and J.I.A., LIX,
144 and discussion).
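
The arithmetic of the Comparative Mortality Figure may be illustrated by a minimal sketch. The age groups, standard population, and death rates below are hypothetical and are not taken from any of the Registrar-General's reports; the function name is likewise only illustrative. The standard population is first reduced so that the standard rates produce exactly 1,000 deaths, and the occupation's own rates are then applied to that reduced population.

```python
def comparative_mortality_figure(std_pop, std_rates, occ_rates):
    """Direct standardization, scaled so the standard rates give 1,000 deaths."""
    # Deaths the standard population would suffer at the standard rates.
    std_deaths = sum(p * m for p, m in zip(std_pop, std_rates))
    # Reduce the standard population so that those deaths total exactly 1,000.
    scale = 1000.0 / std_deaths
    reduced_pop = [p * scale for p in std_pop]
    # Deaths in the reduced standard population at the occupation's own rates.
    return sum(p * m for p, m in zip(reduced_pop, occ_rates))


# Hypothetical figures for age groups 20-24, 25-34, 35-44, 45-54, 55-64.
std_pop = [100_000, 180_000, 160_000, 130_000, 90_000]
std_rates = [0.003, 0.004, 0.006, 0.012, 0.025]
occ_rates = [0.004, 0.005, 0.008, 0.015, 0.030]

cmf_standard = comparative_mortality_figure(std_pop, std_rates, std_rates)
cmf_occupation = comparative_mortality_figure(std_pop, std_rates, occ_rates)
```

With these assumed figures the standard itself gives 1,000 by construction, and the occupation gives a figure above 1,000, indicating heavier mortality than the standard on this basis.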

A method of standardization in which the standard population is composed
arbitrarily of exactly the same number in every age group has also
been suggested by Yule for dealing with comparisons of occupational
mortalities (see G. U. Yule, “On Some Points Relating to Vital Statistics,
More Especially Statistics of Occupational Mortality,” J.R.S.S., XCVII,
1, and comments by Linder and Grove, op. cit., pp. 81-83). The effect of
this equal weighting is obviously the same as would be obtained by taking
simply the arithmetic mean of the age-specific death rates for the various
age groups.
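
The equivalence just stated can be checked numerically. The following is only a sketch with hypothetical death rates and an arbitrary equal number (1,000) in each age group; the function name is illustrative.

```python
def directly_standardized_rate(std_pop, rates):
    """Death rate of a standard population subjected to the given age rates."""
    expected_deaths = sum(p * m for p, m in zip(std_pop, rates))
    return expected_deaths / sum(std_pop)


# Hypothetical age-specific death rates for five age groups.
rates = [0.004, 0.005, 0.008, 0.015, 0.030]

# Standard population with exactly the same number in every age group.
equal_pop = [1000] * len(rates)

standardized = directly_standardized_rate(equal_pop, rates)
arithmetic_mean = sum(rates) / len(rates)
```

Whatever equal number is chosen for the groups, it cancels from numerator and denominator, so that the standardized rate reduces to the simple mean of the rates.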



XIII


THE USE OF CENSUS AND REGISTRATION DATA IN
THE COMPILATION OF STATISTICS RELATING TO
MARRIAGES, BIRTHS, ORPHANHOOD, UNEMPLOYMENT,
ETC.

154. The preceding pages have dealt primarily with the employment
of statistics of the general population as a basis for mortality tables.
Many other important enquiries in which actuaries are interested, however,
may be founded on census and registration material, for such data
comprise information relating to a wide range of sociological problems.
In particular, actuarial estimates of the coverages and costs of social
insurance and social welfare plans of many different kinds are necessarily
based upon statistics of the mass of people to whom such schemes are to
be applied.

The principal sections into which these further problems may be subdivided
are those concerning: (1) Marriages; (2) Births and Fertility; (3)
Dependency and Orphanhood; and (4) Unemployment.

(1) Marriages

155. Rates of Marriage. In pars. 16-25 a brief description was given
of the situation in various countries with regard to the registration of
marriages and tabulations of the resulting statistics. When reliable data
are available, rates of marriage are of course most satisfactorily examined
by calculating the probability or central rate of marriage at each age or
age-group, separate compilations being made for bachelors, spinsters,
widowers, and widows (and divorced, where available). If a comprehensive
figure is desired for the comparison of such rates for different
communities and times, they may be standardized for differences in age
distribution (on the principles already stated in pars. 139-41 for dealing
with mortality statistics).

Such rates, however, whether by ages or standardized, are not always
easy to calculate in years other than the census years, because then they
may involve the estimation of the intercensal populations in each marriageable
class. Crude rates, therefore, are often employed in reports on
vital statistics, by taking (a) the ratio of the number of marriages in each
marriageable class to the corresponding number of bachelors, spinsters,
widowers, widows, or divorced persons, at all ages or at marriageable ages
over 15 or 18, or by using (b) the still more composite number of marriages
per 1,000 living, of all classes of the population at all ages. Such crude
rates, of course, may again be misleading (cf. par. 158), although for some
demographic purposes they may give a sufficiently reliable indication of
the general trend of marriage rates from year to year.

156. Population statistics compiled by the U.S. Census Bureau have
also been used in the construction of double decrement mortality and
marriage tables for single men and single women by W. H. Grabill (“Attrition
Life Tables for the Single Population,” J.A.S.A., XL, 364). Marriage
rates for single males under 25 and single females under 22 were
based provisionally on the 1940 census data (since the marriage transcripts
were deficient in number at ages under the legal age of consent,
and were too numerous in the early twenties on account of misstatements
of age); at higher ages provisional rates were derived from the 1940
marriage data for 25 States (as given in Vital Statistics Special Reports,
Vol. 17, No. 9). Because the improved economic conditions of 1940, and
marriages which had been deferred from the previous depression, had
produced an exceptional number of marriages in 1940, these provisional
rates were modified so that they would be consistent with the number
which would have been anticipated in 1940 for the population at marriageable
ages and the average annual marriage rates during 1920-1939.
The mortality basis for single persons was derived at ages over 20 from
the Census Bureau’s “Mortality by Marital Status, 1940” (Vital Statistics
Special Reports, Vol. 25, No. 2), and at younger ages from the
1939-41 U.S. Life Tables without regard to marital condition.
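
The general form of a double decrement table of this kind may be sketched as follows. This is only an illustration and is not the published method: the rates are hypothetical, and it is assumed that both sets of rates are already expressed as probabilities of leaving the single population during the year of age in the presence of the other decrement.

```python
def double_decrement_table(radix, ages, q_death, q_marriage):
    """Trace single persons subject to the two decrements, death and marriage."""
    rows = []
    single = float(radix)
    for age, qd, qm in zip(ages, q_death, q_marriage):
        deaths = single * qd        # single persons dying during the year of age
        marriages = single * qm     # single persons marrying during the year of age
        rows.append({"age": age, "single": single,
                     "deaths": deaths, "marriages": marriages})
        single -= deaths + marriages  # still single at the next age
    return rows, single


# Hypothetical probabilities for single persons at ages 20, 21 and 22.
ages = [20, 21, 22]
q_death = [0.002, 0.002, 0.002]
q_marriage = [0.10, 0.12, 0.14]

table, still_single = double_decrement_table(100_000, ages, q_death, q_marriage)
total_deaths = sum(row["deaths"] for row in table)
total_marriages = sum(row["marriages"] for row in table)
```

Every person in the radix is accounted for either as a death, as a marriage, or as still single at the end of the table.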

157. Marital Status, and Relative Ages of Husbands and Wives. In
connection with the calculation of probabilities or rates of marriage at
each age (or age-group), it is of course often necessary to use tabulations
of the proportions of the census population according to marital condition,
which are obtainable readily from the schedules. Statistics of the
relative ages of husbands and wives, which again can be derived from
tabulations of the data recorded on the census schedules, are also frequently
of value for other purposes (cf. T.A.S.A., XVIII, 266, 281, and
284; and J.I.A., LII, 57, and LVI, 182).

158. Mortality According to Marital Status. When census data are
tabulated according to marital status, rates of mortality for each such
class can be computed if the death certificates similarly record the marital
condition at death (as on the certificate illustrated in par. 22) and if the
data thus available are tabulated. An illustration of such material is the
U.S. Census Bureau’s Special Report on mortality by marital status already
mentioned in par. 156.




In Great Britain the mortality rates for spinsters, married women, and
widows have been given in the reports of George King on the 1911 census
and the deaths of 1910-12, and by Sir Alfred Watson in the corresponding
1921 and 1931 reports (see the references in par. 120 of this Study). King
also constructed separate life tables for each class; Watson, however, refrained
from doing so, on the ground that each marital group is subject to
increments and decrements (such as the married women group being increased
by spinsters marrying and decreased by husbands dying) which
disturb the validity of the life table as a concept for depicting the mortality
experience of a cohort of supposedly homogeneous lives.

159. Rates of Widowhood. In order to calculate rates of widowhood,
i.e., the proportions of married women who, in each year of age (or in
each year of age by duration of marriage), are left as widows by the
deaths of their husbands, it would be necessary for the death certificate of
a married man to record the age of his wife (and the date of their marriage,
or its duration). As these items are not stated on the death certificates,
estimated rates can be prepared indirectly by assuming mortality
rates for married men in conjunction with tabulations of the relative ages
of husbands and wives, or an approximation can be reached in some cases
by using the average (weighted) age of the husbands or more accurately
from the weighted means of their mortality rates (see T.A.S.A., XVIII,
282, and J.I.A., XLV, 421).
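
The two indirect approaches may be contrasted in a short sketch. The distribution of husbands' ages for wives aged 40 and the mortality rates of married men are hypothetical, the function names are illustrative, and the wives' own mortality during the year is ignored for simplicity.

```python
def widowhood_rate(husband_counts, husband_mortality):
    """Weighted mean of the husbands' mortality rates for wives of one age.

    husband_counts:    {husband's age: number of couples}
    husband_mortality: {husband's age: annual mortality rate of married men}
    """
    couples = sum(husband_counts.values())
    expected_deaths = sum(n * husband_mortality[age]
                          for age, n in husband_counts.items())
    return expected_deaths / couples


def mean_husband_age(husband_counts):
    """Average age of the husbands, weighted by the number of couples."""
    couples = sum(husband_counts.values())
    return sum(age * n for age, n in husband_counts.items()) / couples


# Hypothetical tabulation for wives aged 40.
counts_wife_40 = {38: 100, 42: 300, 46: 100}
q_married_men = {38: 0.004, 42: 0.006, 46: 0.010}

rate_weighted = widowhood_rate(counts_wife_40, q_married_men)
average_age = mean_husband_age(counts_wife_40)
rate_at_average_age = q_married_men[int(average_age)]
```

In this assumed case the mortality rate at the husbands' average age understates the weighted mean of the rates, because mortality rises more than proportionately with age; this is the sense in which the second approach is the more accurate.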

160. A complete treatment of the character and technical utilization
of the marriage statistics derivable from population data does not fall
within the scope of this Study. The discussion here must therefore be
limited to the indications stated in pars. 155-59. It may be emphasized,
nevertheless, that various types of marriage statistics, especially when
taken in conjunction with material respecting fertility, are obviously of
major importance in many investigations into the sociological and economic
implications of population growth or decline (such as those which
recently have engaged the attention of the Royal Commission on Population
in Great Britain, where anxieties had arisen with regard to the trends
in fertility, size of families, and total population).

(2) Births and Fertility 

161. The principal measures of basic actuarial importance in connection
with statistics of births and the analysis of fertility are the birth
rates according to age, sex, and marital condition of the parent, i.e.,
more precisely, the probabilities or rates of issue to persons (either married
or still single) of each sex and age. In the presentation of such rates it is
desirable to distinguish between single and plural births, and to give
separate tables for those which are legitimate and illegitimate.*

When birth registrations are sufficiently reliable the most satisfactory
method would be to construct, from such registrations and the corresponding
tabulations of parents, complete tables showing the ages of
the parents at the births of their children (subdivided by number of
children at a birth, and according to legitimacy), and thence to compute
the probabilities or rates of issue to all the potential parents (or to all
males or females) of ages and sexes corresponding to those who were
tabulated as having children. A valuable discussion of the compilation of
tables of this description has been given by C. H. Wickens in Appendix
B, 1911 Census of Australia (“On the Materials for, and the Construction
of, Tables of Natality, Issue, and Orphanhood”), where extensive data
which had been available in Australia for many years past were used, and
tables were included showing (1) the ratio of the legitimate births per
annum to the mean number of males of the same age as the fathers, (2) the
average number of children per confinement, by age of father, and also
(3) the surviving legitimate children under one year of age, corresponding
to males surviving out of 100,000 at age 14, and (4) the children under 15,
who either have their fathers living or are orphans, corresponding to
males surviving out of 100,000 at age 14.

The calculation of these issue rates by age necessitates a knowledge of
the ages of parents at the births of their children. This information has
been obtained for many years on the birth certificates of the United
States, Canada, Australia, and some other countries, so that tabulations
of the number of births according to the ages of the mothers have been
available for such computations.†

* In reports on vital statistics the term “age-specific birth rate” will be encountered;
it means the number of births (of both sexes) to mothers of a given age, divided by
the total number of women of that age. When differentiation of the births by sex is
made, the “male (or female) age-specific birth rate” is used correspondingly to describe
the male (or female) births to mothers of a given age divided by the total women of
that age.

† In Great Britain, however, this material was not collected at the time of the
preparation of the cost estimates for the National Insurance Act of 1911; Sir George
Hardy and F. B. Wyatt consequently then had recourse to certain New Zealand data
as to the number of children under 5 left by deceased males, and applied thereto the
mortality among the fathers and children, and the number of births, in order to work
back to the probabilities of issue (see J.I.A., XLV, 421). It may also be noted that
subsequently a series of select issue rates was published in J.I.A., XLVIII, 112, on the
basis of the census figures of the Borough of Camberwell, showing for six selected ages
at marriage of females the central issue rates at each age attained (as well as aggregate
tables for husbands and wives irrespective of the age at marriage). The Population




162. In the reports which are issued annually in many countries on
birth registration statistics, however, such elaborate presentations of issue
rates are not necessary. An indication of the birth rates of the population
from year to year may then be obtained by calculating the legitimate
birth rates, by age groups, for the married women living between about
ages 15 and 45, and the similar illegitimate birth rates by age groups for
the unmarried women aged 15-45. Where more compact figures are required
for purposes of comparison these rates may then be “directly
standardized”; or if the rates by age groups cannot be ascertained, the
crude legitimate and illegitimate rates for all the ages 15-45 may be
calculated and then “indirectly standardized” on the principles of par.
130 (cf. Supplement to the 75th Registrar-General’s Report, Part III,
p. xviii, and Newsholme’s “Vital Statistics,” p. 86). The crude birth rate,
being the ratio of the total births to the total population of both sexes at
all ages, has also been used widely; but it is, of course, subject to the
disturbances of varying distributions as pointed out with respect to the
crude death rate in par. 138.

163. The probabilities or rates of issue according to the ages and sexes
of the parents, and their further tabulation by duration of marriage,
which may be required for actuarial purposes (par. 161), and the “birth
rates” which are commonly used in reports on vital statistics for presentation
to the public (par. 162), are often referred to in general language as
measures of “fertility.” Care must be exercised, however, in the use of
this last term; it is employed by statisticians in many different senses, and
therefore should always be carefully defined when any so-called “fertility
rate” is quoted.

Since “fertility” in its general demographic aspects must always be of
interest to vital statisticians and the sociologists, data were collected on
the census forms at the 1910 U.S. Census and the 1911 census of England
and Wales with respect to the number of years the present marriage has
lasted, the number of children born alive to each married woman, and the
number still living. The English data thus obtained were analyzed in
Vol. XIII, Part 1, of the 1911 reports, and by Dr. T. H. C. Stevenson in
J.R.S.S., LXXXIII, 407 (see also Newsholme’s “Vital Statistics,” p. 96,
and T.A.S.A., XVIII, 283). These questions, which were expressly intended
to produce “fertility” statistics, were omitted, however, from the
1920 U.S. and 1921 English censuses. In Great Britain the two new questions
which were introduced in the census of 1921 (see par. 164 here) were
used instead to give indications of the comparative incidence of “fertility”
among various sections of the population in the year preceding the
census (see the 1921 census report, England and Wales, on “Dependency,
Orphanhood, and Fertility”).

(Statistics) Act of 1938 (see par. 163 here) has now made possible the calculation of
probabilities of issue to women at each age; see, for example, the Government Actuary’s
Report on the National Insurance Act, 1946 (Cmd. 6730).

In 1946, because such data had been omitted from the censuses after
1911, the Royal Commission on Population found it necessary to conduct
a special voluntary “family census” based on a 10% sample of all women
who were or had been married, by asking for their ages, dates of marriage,
dates of birth of their children, and the occupations of their husbands;
and they recommended (since “at present in Great Britain the arrangements
for the collection and analysis of fertility statistics are not adequate
to modern needs”) that family census questions should be included henceforth
as an essential part of the decennial census in order to make possible
the continuous study of family size.*

In the United States fertility questions were reinstated at the 1940
census, including queries in respect of each woman ever married as to her
age at first marriage and the number of children ever born alive. On the
1950 census form the questions were asked only for each person (male or
female) on the 3⅓% sample lines, and included queries on the duration of
marriage and the number of children ever born alive to each female who
had ever been married.

The 1941 Canadian census asked, in respect of each woman ever married,
her age at first marriage, the number of children ever born alive,
and the number living on June 2, 1941 (the census date). Such questions,
however, were omitted from the census of 1951.

In Great Britain the securing of more adequate material concerning
fertility and marital status has also been facilitated notably by the
passage in 1938, as a temporary enactment requiring periodical renewal,
of the Population (Statistics) Act. This measure provides for additional
but confidential enquiries (from which, however, statistical tabulations
may be made) when births, stillbirths, and deaths are registered. With
respect to births and stillbirths the questions cover the age of the mother,
the date of the marriage, the number of children by the present husband
and the number still living, and the number of children and the number
* The conclusions of the Royal Commission emphasized that during the last 70
years in Great Britain the salient feature of the population changes has been the fall in
“the average size of the family.” This phrase is used by the Commission to indicate
“the number of children born per married couple”; it is thus concerned “not with the
average number of persons in a household, nor even with the average number of dependent
children in a family, but with the average number of live births to a marriage
of ‘completed fertility,’ i.e., one in which the wife has passed the limit of the child-bearing
period, [where] in computing this average it is of course necessary to include
childless marriages as well as those which are fertile.” See also J.R.S.S., CXIV, 41 and
48 with respect to the technical difficulties involved in this concept.




still living by any former husband. Upon registration of a death the data
secured are (1) in the case of a male, whether he had ever been married
and whether he was married when he died; (2) in the case of a married
woman, the year and duration of her marriage, and whether she had children
by her husband; and (3) the age of the surviving spouse.

(3) Dependency and Orphanhood

164. The actuarial problems encountered in connection with dependency
at advanced ages in the general population do not as a rule involve
statistics beyond the scope of those already referred to in this Study. At
young ages, however, data are frequently required which hitherto have
seldom been obtained; a brief reference must therefore be made to the
nature of the problem thus presented.

In connection with various types of schemes for pensions to widows and
children it is often necessary to know the numbers, ages, and sexes of
children (under, say, age 16) in relation to the ages of their fathers or their
widowed mothers. Such statistics have been collected for many years in
New Zealand on the death certificates of males (see G. King, J.I.A.,
XXX, 299; J.I.A., XLV, 421; and T.A.S.A., XVIII, 284), although they
have not been, and still are not, available for other countries. For the
calculation of “family annuities” by the “collective method” material of
this type can, as an alternative, be called for on the census schedules in
respect of living families. Consequent on a realization of the general inadequacy
of the data bearing on these matters, therefore, two new and
direct questions were inserted in the 1921 census schedule of England and
Wales, and Scotland. Those questions asked (1) in respect of each child
under 15, whether the parents were both alive, the father dead but the
mother alive, or vice versa, or both parents dead, and (2) required, in
respect of married men, widowers, and widows, a statement of the number
and ages of all living children and step-children under 16. The value of
these statistics in the calculation of “family annuities” is pointed out in
J.I.A., LII, by G. King (p. 37) and Menzler (p. 372), and is well illustrated
in the Appendix to the British Government Actuary’s Report on the
Financial Provisions of the Pensions Bill, 1925 (Cmd. 2406) which is reprinted
in J.I.A., LVI, 180. Such data, however, are still generally unobtainable
from population material, notwithstanding representations by
actuaries with regard to their importance. The reason for this has been
mainly that census and registration officials have been reluctant to add
the necessary questions to the census schedules or the death certificates.

165. In the United States valuable statistics on family composition
(even though they are now somewhat out of date, and were affected by
the low birth rates of the depression period) may be found in Vol. XI,
Memorandum No. 45, of the Bureau of Research and Statistics, Social
Security Board, on “The Urban Sample; Statistics of Family Composition
in Selected Areas of the United States, 1934-36.” The data, which covered
family size and type, children, employment status, occupation, income,
housing, race, nativity, and education, were gathered in 1935-36 through
a house-to-house canvass by the Works Projects Administration under
the National Health Survey.

The number of paternal orphans, also, has been estimated for 1940,
1945, 1950, and 1955 by T. J. Woofter (see Social Security Bulletin,
October, 1945). The calculations were made by taking the number of
births by age of father in each of the preceding 18 years, applying the
death rates of fathers by age to determine the number of deaths among
fathers, and using survival rates of children to compute the number of
orphans surviving to the specified year.
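
The steps of such a calculation may be sketched as below. This is an illustration under stated assumptions and not a reproduction of the published estimates: all figures are hypothetical, the function names are illustrative, only two years of births are shown instead of 18, and the mortality of father and child is assumed to be independent.

```python
def paternal_orphans(births, father_death_rate, child_survival):
    """Estimate the children alive in the target year whose fathers have died.

    births:            {(years before target year, father's age at birth): births}
    father_death_rate: function giving the annual death rate of fathers by age
    child_survival:    {years before target year: probability that a child
                        born then survives to the target year}
    """
    total = 0.0
    for (years_ago, father_age), number in births.items():
        # Probability that the father survives every year since the birth.
        p_father_alive = 1.0
        for t in range(years_ago):
            p_father_alive *= 1.0 - father_death_rate(father_age + t)
        # Surviving children whose fathers have died in the interval.
        total += number * child_survival[years_ago] * (1.0 - p_father_alive)
    return total


def flat_father_rate(age):
    """Hypothetical death rate of fathers, taken as 1% at every age."""
    return 0.01


# Hypothetical births one and two years before the target year, fathers aged 30.
births = {(1, 30): 1000, (2, 30): 1000}
child_survival = {1: 0.95, 2: 0.94}

orphans = paternal_orphans(births, flat_father_rate, child_survival)
```

In a full calculation the births would be entered for each of the preceding 18 years and for every age of father, with death rates varying by age.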

(4) Unemployment 

166. Original calculations of the costs of unemployment insurance
plans have frequently required statistics with respect to working time
lost, and the probabilities of being unemployed within a period of 12
months for n weeks or more, which can be collected on the census schedules.
Data concerning employment or unemployment status at the date
of the census, and the number of weeks worked during the preceding 12
months, are also of value for many other purposes.

On the 1950 U.S. census schedule, for example, each person aged 14 or
over accordingly was asked whether (yes or no) he or she did any work at
all last week (not counting work around the house), whether work was
being sought, whether he or she had a job or business even though last
week no work was done, and the number of hours worked during the last
week. On the sample lines, also (cf. par. 12), persons 14 years of age and
over were asked for how many weeks work had been sought, and in how
many weeks was any work at all (not counting work around the house)
done in 1949. On the 1941 Canadian census schedule questions with
regard to unemployment were included in the form “If a wage-earner,
were you at work on June 2, 1941?” and “If not, give reason,” and for
wage-earners only two employment enquiries were made as to the number
of weeks worked and total earnings during the 12 months prior to June
2, 1941 (the census date).

The tabulated replies to such queries, of course, are subject to many
inaccuracies, because they depend wholly upon the knowledge, memory,
and degree of care exercised by the individual who answers the census
enumerator’s questions.



XIV 


THE THEORY OF REPRODUCTIVITY

167. The ideas set out in the preceding sections of this Study dealing
with mortality rates and the methods of stating marriage rates and birth
rates, inevitably for many years have led vital statisticians and mathematicians,
sociologists, economists, and others (now often comprehensively
described as the “population mathematicians” and the “demographers”)
to seek a compact mathematical-statistical measure which might
give reliable indications of the trends which result from the combined
impacts of these varying rates of birth, marriage, and mortality. Obviously
this must be very difficult, for the rates are highly complex and are
continuously changing, sometimes quite sharply; moreover, marriage and
birth rates are subject to the voluntary controls of individuals, and those
controls are influenced largely by racial, religious, social, and economic
forces, and by periods of peace or war. The best that can be done can only
be the construction of some kind of mathematical model, based upon
greatly simplified and technically manageable suppositions. Such hypothetical
models must, of course, be viewed in contrast with the known but
unmeasurable complexities of the rates actually involved; both in their
composition and their use it must be clearly understood that at many
points they substitute fanciful imagination for reality. Their practical
utility is consequently open to considerable debate. Their theoretical
foundations, however, must be included here because now they are encountered
widely in the literature, and the techniques which they involve
are interesting despite their demonstrable weaknesses.

In pars. 161-63 consideration was given to the meanings and methods
of stating “probabilities or rates of issue” and “birth rates,” and attention
was directed to the importance of defining “fertility” in any discussion of
that word. Two other general terms, namely, “reproduction” and
“reproductivity,” will also be encountered, frequently in a sense almost
synonymous with “fertility” or “births,” though again in a recent technical
terminology where their precise meanings and implications must be
carefully defined. The theory underlying the “measures of reproductivity”
to be now examined will be followed with ease only when the special
definitions of this recent terminology are thoroughly understood.

168. The Gross Reproduction Rate. Let $f_F(x)$ denote the female birth
rate at age $x$, that is, as in the footnote to par. 161, the “female age-specific
birth rate,” being the female births to mothers aged $x$ divided by
the total women aged $x$, and suppose that the mothers’ child-bearing
ages are covered by ages $X$ to $L$ only. It will then be evident that the
simple total $\sum_{x=X}^{L} f_F(x)$ will give the number of daughters who would be
born to each woman during the child-bearing ages $X$ to $L$, so long as (a)
the woman is not subject to the risk of mortality between ages $X$ to $L$, and
(b) it is considered appropriate in the calculations thus to compute the
daughters by using values of $f_F(x)$ determined from a short and fixed
observation period (1944, which may be called the “base period,” in the
example here) which ignores the mother’s “generation” (cf. Section X).
These conditions may be seen easily from column (2) of the table on
p. 217 (taken, with slight rearrangement and amplified column headings,
from A. H. Pollard’s paper on “The Principles and Limitations of Fertility
Indices,” Actuarial Society of Australasia, 1947) which with illustrative
data shows the form of calculation in 5-year groups adopted in
many similar examples to be found in the writings of Kuczynski and
others. The index thus computed (which was first used by Kuczynski)
has become known as the Female Gross Reproduction Rate. It will be noted
that it is independent of the actual age, sex, and marital distributions of
the population (here a particular population in 1944) for which the rate
is computed. Its value 1.289 (in the example) is claimed by its advocates
to suggest that 1,000 women living through the child-bearing period of
this generation without any loss by death would replace themselves by
1,289 women in the next generation, and consequently that the births are
here more than sufficient to “maintain” the population, i.e., the population
is more than “reproducing itself,” and therefore is “steadily increasing.”*
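
The form of the calculation in 5-year groups may be sketched as follows. The female age-specific birth rates used here are hypothetical and are not those of the table on p. 217; the function name is illustrative. Each rate is taken to apply to every single year of age within its group, so that the total over single ages is the sum of the group rates multiplied by the width of the group.

```python
def gross_reproduction_rate(female_birth_rates, width=5):
    """Sum of female age-specific birth rates over the child-bearing ages.

    female_birth_rates: annual female births per woman in each age group
    width:              number of single years of age in each group
    """
    return width * sum(female_birth_rates)


# Hypothetical rates for age groups 15-19, 20-24, ..., 45-49.
rates_by_group = [0.010, 0.055, 0.065, 0.045, 0.025, 0.008, 0.001]

grr = gross_reproduction_rate(rates_by_group)
daughters_per_1000_women = 1000 * grr
```

No allowance for the mortality of the mothers enters at any point, which is the first of the two conditions on which the index rests.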

This concept, however, is obviously most unrealistic, and indeed
meaningless, particularly in its supposition that the mothers are not
subject to mortality. It is stated here only because it is the first step in
the development of the “net” reproduction rate (par. 169), and also because
it has been described as providing evidently an upper limit which
the net rate would attain if mortality were to improve so greatly that it
would become negligible.

The gross reproduction rate explained in this paragraph is the female rate, relating mothers and daughters. A corresponding male rate tracing

* These phrases are placed in quotes because they constitute the customary language of the proponents of these theories, and because on careful examination they reveal the objectives and practical limitations of the methods.





sons and fathers can, of course, also be computed with a similar rationale. The female rate, however, has usually been employed in practice on the grounds that females actually bear the children, the range of ages λ to L is shorter for women than for men so that the calculations are easier, and the required data are more readily available for females.

For the purposes of mathematical analysis, the female and male gross reproduction rates can be stated as

$$\int_{\lambda}^{L} f_F(x)\,dx \quad\text{and}\quad \int_{\alpha}^{\beta} f_M(x)\,dx,$$

where $f_F(x)$ and $f_M(x)$ are the instantaneous female-to-female and male-to-male birth rates, and α and β for males are the age limits corresponding to the female λ and L. Since the $f$’s are zero for all values of $x$ beyond these limits, the integrals can be taken as $\int_0^{\infty} f_F(x)\,dx$ and $\int_0^{\infty} f_M(x)\,dx$.

169. The Net Reproduction Rate. When the mortality of mothers, which is ignored in the gross reproduction rate, is introduced, it will be clear that

$$\sum_{\lambda}^{L} f_x^F \; {}_xp_0^F = \sum_{\lambda}^{L} f_x^F \, \frac{l_x^F}{l_0^F},$$

where the F in ${}_xp_0^F$ and $l_x^F$ indicates that those values are taken from an appropriate female life table, will give the total number of daughters who would be born to each original female baby during her lifetime, again on assumption (b) of par. 168. This index, which was first used by R. Böckh in Germany and subsequently was adopted by Kuczynski and many others (including the central vital statistics offices of several countries), is called the Female Net Reproduction Rate. In column (5) of the example here its value 1.176 would be used to suggest that 1,000 female babies (of one generation) would in their turn, and with due allowance for their mortality prior to and during their child-bearing periods, produce 1,176 female babies (of the next generation), and consequently that the “balance of births and deaths” is more than sufficient to “maintain” the population, i.e., the population is more than “reproducing itself” and is therefore “steadily increasing.”*

* See footnote to the previous paragraph.

With the same notation as in par. 168, and writing ${}_xp_0^M$ for the values from a male table in contrast with ${}_xp_0^F$ for the female values, the female net reproduction rate is

$$\int_{\lambda}^{L} {}_xp_0^F \, f_F(x)\,dx$$

and the male rate is

$$\int_{\alpha}^{\beta} {}_xp_0^M \, f_M(x)\,dx.$$

Extending the limits as before, the female rate can be written more compactly as $\int_0^{\infty} \varphi_F(x)\,dx$, where $\varphi_F(x) = {}_xp_0^F \, f_F(x)$

ILLUSTRATIVE APPROXIMATE CALCULATION FOR FEMALES OF THE GROSS REPRODUCTION RATE, NET REPRODUCTION RATE, AND INHERENT RATE OF NATURAL INCREASE, FOR 1944, USING 5-YEAR AGE GROUPS

Column headings:
(1) Age group
(2) Female birth rate* to mothers, in base period 1944
(3) Assumed midpoint of age group
(4) ${}_xp_0^F$ from female life table appropriate to base period 1944
(5)† Daughters born to each original female baby during her lifetime = 5 × (2) × (4)
(6)† (5) × (3)
(7)† (5) × (3)²

  (1)        (2)       (3)      (4)        (5)†        (6)†         (7)†
 15-19     .01102     17.5    .93704     .05165        .904       15.817
 20-24     .06249     22.5    .92888     .29025       6.530      146.936
 25-29     .07825     27.5    .91789     .35911       9.876      271.576
 30-34     .05948     32.5    .90527     .26921       8.749      284.356
 35-39     .03501     37.5    .89031     .15586       5.845      219.180
 40-44     .01069     42.5    .872?8     .04666       1.983       84.284
 45-49     .00088     47.5    .85065     .00376        .179        8.484
 Total     .25782                       1.17650      34.066    1,030.633
                                        = $R_0$      = $R_1$      = $R_2$

Female Gross Reproduction Rate = 5(.25782) = 1.289.

Female Net Reproduction Rate = $R_0$ = 1.176.

Calculation of the Inherent Rate of Natural Increase, $r$, by (93): $\alpha = R_1/R_0 = 28.955$; $R_2/R_0 = 876.017$; $\tfrac{1}{2}\beta = \tfrac{1}{2}\alpha^2 - \tfrac{1}{2}R_2/R_0 = -18.8125$; $\log_e R_0 = .162542$. Hence, by (93), $-18.8125\,r^2 + 28.955\,r - .162542 = 0$, from which the admissible solution‡ is $r = .00563$.

* For example, .06249 is taken as (Births to mothers 20-24 l.b.d.) ÷ (Female population 20-24 l.b.d.). Instead of using female birth rates to mothers as here, the total rate of male and female births to mothers is often employed in practice, with subsequent adjustment by multiplying the results by the sex ratio at birth (see R. J. Myers, T.A.S.A., XLI, 94-96 and 101, and references there cited).

† In cols. (5), (6), and (7) the figures shown follow those in Pollard’s paper; as explained by the author, they have been cut down from his original extended calculations which actually were made with a greater number of decimals throughout.

‡ The two solutions of this quadratic are .00563 and 1.53456. The latter, however, is evidently inadmissible, because (a) $r$ clearly must lie in the neighborhood of .0056 (a population increase of slightly over ½% p.a.), since .00563 is the single solution indicated by the [. . .] stated in the footnote to equation (93) in the text; (b) 1.53456 is thus an extraneous solution, just as a cubic or higher extension of the approximate (93) would produce still other extraneous roots; and (c) a population increase of 153% p.a. obviously must be dismissed as being impossible on general grounds.





has been called the female “net fertility function” (being the probability, at birth, that a female child when she attains age $x$ will, for the next generation, bear a female child). The corresponding male rate would then be $\int_0^{\infty} \varphi_M(x)\,dx$, where the male “net fertility function” $\varphi_M(x)$ is ${}_xp_0^M \, f_M(x)$, i.e., the probability, at birth, that a male child when he attains age $x$ will then become the father of a male child. It is to be noted that the net fertility functions $\varphi_F(x)$ and $\varphi_M(x)$ involve the rates of mortality and birth explicitly, and also the marriage rate implicitly.
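The arithmetic of pars. 168-169 can be retraced with a minimal Python sketch. It takes the figures of columns (2), (5), (6), and (7) of the illustrative table as given (column (5) is used directly rather than recomputed from columns (2) and (4)) and then solves the quadratic quoted in the table's note for the inherent rate. The variable names are illustrative assumptions and are not part of the original text.

```python
import math

# Illustrative 5-year data (age groups 15-19 ... 45-49) read from the table.
birth_rate = [0.01102, 0.06249, 0.07825, 0.05948, 0.03501, 0.01069, 0.00088]  # col. (2)
daughters = [0.05165, 0.29025, 0.35911, 0.26921, 0.15586, 0.04666, 0.00376]   # col. (5)
first_moment = [0.904, 6.530, 9.876, 8.749, 5.845, 1.983, 0.179]              # col. (6)
second_moment = [15.817, 146.936, 271.576, 284.356, 219.180, 84.284, 8.484]   # col. (7)

WIDTH = 5  # years in each age group

gross_rate = WIDTH * sum(birth_rate)  # daughters per woman, mortality ignored
r0 = sum(daughters)                   # net reproduction rate R0
r1 = sum(first_moment)                # R1
r2 = sum(second_moment)               # R2

# Quantities named in the table's note: alpha = R1/R0, (1/2)beta = (alpha^2 - R2/R0)/2.
alpha = r1 / r0
half_beta = 0.5 * (alpha * alpha - r2 / r0)

# Solve (1/2)beta * r^2 + alpha * r - log_e(R0) = 0, keeping the small root.
constant = -math.log(r0)
discriminant = alpha * alpha - 4.0 * half_beta * constant
r = (-alpha + math.sqrt(discriminant)) / (2.0 * half_beta)

print(f"gross rate = {gross_rate:.4f}, R0 = {r0:.4f}, alpha = {alpha:.3f}, r = {r:.5f}")
```

Because α is carried here at full precision rather than rounded to 28.955, the coefficient of r² comes out slightly different from the −18.8125 quoted in the note, but the small root still agrees with the quoted .00563 to the figures shown.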

170. These concepts require close examination. Firstly, when computed in practice by the formula of par. 169 as in the numerical example here, they implicitly involve the assumption (b) stated in par. 168, and thus in effect use different generation values of $f(x)$, since all the values of $f(x)$ are ordinarily derived from a fixed period of observation (the base period) which involves different generations, to compute an index which is supposed to forecast the births of a single generation.

Secondly, the values of the birth rate $f(x)$ (which implicitly involve also the antecedent marriage rate at some earlier age in respect of legitimate births) are usually based on a particular short period of observation (here the base period 1944), and the values of the mortality function ${}_xp_0$ likewise are generally* taken from a life table based on some particular short period of observation appropriate to the base period; and with these values an attempt is made to estimate future births over a long span of years, on the assumption that the values will not change, regardless of the fact that they are almost certain to change appreciably. Objection to this feature of the calculations has been expressed by several critics. J. Hajnal, for instance (in Population Studies, I, 137), has remarked that “to look upon the long-term prospects of population growth following from the maintenance of the situation of a given year in terms of rates which are certain not to remain as they are is clearly an unreasonable proceeding.” In an official Appendix (p. 246) to the Report of the British Royal Commission on Population the fundamental defect of the net reproduction rate is stated by W. A. B. Hopkin to lie in the fact that “it defines the demographic habits of the population in terms of the

* The Registrar-General of England and Wales has modified this customary procedure by using the lower forecast mortality rates to be expected in the future in order to find what he calls an “effective” reproduction rate. This modification in the comparatively unimportant mortality assumption, however, usually produces only a trivial difference in the third decimal place of the ordinary net reproduction rate.




age-specific fertility and mortality rates of a particular year or series of years; this procedure may not be seriously unsatisfactory on the side of mortality . . . but on the side of fertility it is quite inadequate . . . [because] recent experience has shown that the age-specific fertility rates of any one year, or even of several years together, may be substantially raised or lowered by factors which have nothing to do with the basic habits of the population.” W. Perks (J.I.A., LXXIV, 327) observed that “the measurement of reproductivity, i.e., the extent to which the population was reproducing itself . . . [is] essentially a generation idea, and the troubles . . . arose out of attempts to measure a generation concept by means of data taken over a short period, and particularly the fertility data over a short period.” Or, to put the matter in other language (paraphrasing slightly F. A. A. Menzler’s similar criticism in J.I.A., LXXVI, 53), these reproduction rates purport to give a summary of the way population influences are working out at a given point of time, if all the forces then operating continue to operate unchanged in the future (and thus for laymen they seem to “dramatize the consequences of the demographic position obtaining from time to time”). As it is well known, however, that such forces (especially the marriage rate and the birth rate, both of which are actually involved in $f(x)$) do not operate without changes which are sometimes drastic, reproduction rates so calculated can have no real value for the purpose of determining the likelihood of a population reproducing itself.

Thirdly (as in the case of the gross rate), the net reproduction rates ignore the actual age and sex distributions of the population, although the marital distribution is reflected indirectly in the values of the birth rate $f(x)$; then, notwithstanding the fact that the actual distributions must in reality determine the population’s reproductive capacities which the rates essay to measure, they calculate a theoretical number of births based on a purely hypothetical life-table population which in its distributions by sex and age (and without any regard for marital condition) has no necessary relation to the realities, whether with respect to the present or the future or any other actual population.

171. These features of the calculations must be interpreted as warnings that the net reproduction rate will not always provide any reliable indication as to whether a given population really is or is not reproducing itself. Actuaries, who are thoroughly schooled in the hypothetical nature and computational uses of the fictitious “stationary population” of the life table (see the footnote to par. 64 here), will hardly need to be warned that any attempt to compress the many varied influences of “reproductivity” into a single composite index-number like these reproduction rates




is foredoomed to the same kinds of difficulties and failures which vitiate so many of the conclusions which to laymen seem at first glance to be deducible from crude death rates, expectations of life, and the so-called “life table death-rate” based on varying age and sex distributions (cf. par. 138 here; “The Fundamental Principles of Mathematical Statistics,” pp. 113-14; and F. A. A. Menzler, J.I.A., LXXVI, 53). The net reproduction rate, nevertheless, in recent years has captured the imaginations of the public, the press, and the politicians, just as the expectation of life unfortunately is still believed by many laymen to give a reliable and complete picture of the prospects of longevity for nations or even individuals. Policy with respect to future population growth and economics cannot be founded safely upon any such index-numbers which generally hide much more than they reveal. Again it may be said here (cf. Wolfenden, T.A.S.A., XXXV, 282) that perhaps some day we may succeed in disseminating outside actuarial ranks the truth about what the “life table,” the “expectation of life,” and now the “net reproduction rate,” cannot do.

The calculation of a net reproduction rate (whether in the form already 
stated, or as modified in pars. 172-73 hereafter) thus obviously cannot 
under any circumstances replace a detailed analysis of the mortality, 
marriage, and birth rates by ages and durations as a means of examining 
the trends in marriages, the number of children per marriage, and the 
ultimate size of a population (see also J. Hajnal, Population Studies, I,
137, and B. Benjamin, J.I.A., LXXVI, 41).* 

172. Net Reproduction Rates for Males and Females. For the same reasons as those stated at the end of par. 168 for the gross rate, the net reproduction rate is usually computed for females, although the male rate is also sometimes given. The numerical values of these male and female rates, however, do not always give similar indications; R. J. Myers, for example, in his paper on “The Validity and Significance of Male Net Reproduction Rates,” J.A.S.A., XXXVI, 275, shows instances of male rates greater, but female rates smaller, than unity, and points out that such differences are due to the relative proportions of the male and female populations which are, in fact, reflected indirectly in the values of $f_M(x)$ and $f_F(x)$ based on those populations. Such parallel constructions of

* An alternative statistical approach on such lines may be found in P. R. Cox’s paper on “Reproductivity in Great Britain: A New Standard of Assessment,” J.I.A., LXXIX, 239. The differences are there examined between the numbers of children actually observed, and those theoretically required for a model self-reproducing population computed on the basis of the various mortality and marriage elements involved (with emphasis on the duration of marriage as one of the most important factors).




male and female rates, moreover, illustrate the anomaly that there is a basic mathematical and practical conflict between the male and female rates when both are computed and compared. The theory of both rates, which are calculated by means of functions based on the life-table concept of a “stationary population,” supposes that mortality and fertility remain unchanged and that there is no migration; furthermore, rates thus based theoretically on the life-table hypothesis can actually occur at their precise calculated values only in a population which has in fact attained a “stationary condition.” In any given population, therefore, the exact theoretical net male or female reproduction rate would not be likely to occur at the time of its calculation (the rate then might in reality be larger or smaller); if, however, the assumed mortality and fertility persisted without migration, eventually the population would reach a stable age and sex composition in which the calculated reproduction rate would actually appear (see also par. 174). But if the male and female rates are supposed, as the calculations may indicate, to be not the same, the ultimate outcome in this eventually stable population must then be that in the course of time one sex would completely submerge the other; that is to say, in the final result different male and female reproduction rates cannot co-exist.

This anomaly between the male and female net reproduction rates, which is a major objection to their use, has been examined mathematically in several papers, particularly by P. H. Karmel (in Population Studies, I, 249, and 353, and II, 240, and a summary in J.I.A., LXXIV, 329) and A. H. Pollard (“The Measurement of Reproductivity,” J.I.A., LXXIV, 288). Using the approach adopted in a long series of contributions by A. J. Lotka, and epitomized conveniently by E. C. Rhodes (in three papers on “Population Mathematics” in J.R.S.S., CIII, 61, 218, and 362), let $B_F(t)$, $B_M(t)$, and $B_T(t)$ denote female, male, and total births at time $t$, and $\varphi_F(x)$ and $\varphi_M(x)$ the female and male net fertility functions already defined. Then $B_F(t-x)$ represents the number of female children born at time $t-x$, and the number of female children born to them when they are aged $x$ to $x+dx$, at time $t$ to $t+dx$, is $B_F(t-x)\,\varphi_F(x)\,dx$, where necessarily $L > x > \lambda$; consequently the total female births at time $t$, for all ages of mothers in their child-bearing years, is given by

$$B_F(t) = \int_{\lambda}^{L} B_F(t-x)\,\varphi_F(x)\,dx. \tag{75}$$

Since $\varphi_F(x)$ is zero beyond the limits λ and L, as before, this may be written

$$B_F(t) = \int_0^{\infty} B_F(t-x)\,\varphi_F(x)\,dx. \tag{76}$$




Similarly,

$$B_M(t) = \int_0^{\infty} B_M(t-x)\,\varphi_M(x)\,dx. \tag{77}$$

In his analysis of the male-female anomaly, Karmel points out that these two relations are connected by a third, namely,

$$B_M(t) = m\,B_F(t), \tag{78}$$

where $m$ is the masculinity at birth, this relation being essential because its omission “would imply that the masculinity at birth would, for large $t$, approach 0 or ∞, i.e., either the male or female population would outstrip the other” (J.I.A., LXXIV, 330). Since the mathematical model thus representing the births is defined by the three functions $\varphi_F(x)$, $\varphi_M(x)$, and $m$, and the unknowns $B_F(t)$ and $B_M(t)$ are only two, the system is over-determinate. The reason for the anomaly can thus be seen.
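The conflict can be made concrete by projecting the female and male renewal relations independently of one another and watching the implied masculinity at birth. The following Python sketch does this in 5-year time steps; the two net fertility schedules are hypothetical (chosen so that the female rate is below unity and the male rate above it), and the function and variable names are assumptions made only for illustration.

```python
def project_births(phi, initial, steps):
    """Iterate B(t) = sum over k of B(t - k) * phi[k], in 5-year time steps.

    phi[k] is the net fertility assigned to a lag of k steps (phi[0] is unused);
    initial must supply at least len(phi) - 1 starting values of B.
    """
    if len(initial) < len(phi) - 1:
        raise ValueError("need at least len(phi) - 1 initial values")
    births = list(initial)
    for _ in range(steps):
        t = len(births)
        births.append(sum(births[t - k] * phi[k] for k in range(1, len(phi))))
    return births


# Hypothetical net fertility by lag (in 5-year steps); not taken from the text.
phi_female = [0, 0, 0, 0, 0.10, 0.30, 0.30, 0.15, 0.08, 0.02]      # sums to 0.95
phi_male = [0, 0, 0, 0, 0.02, 0.18, 0.30, 0.28, 0.17, 0.08, 0.02]  # sums to 1.05

initial = [100.0] * 10  # equal male and female births in the starting periods
female = project_births(phi_female, initial, steps=200)
male = project_births(phi_male, initial, steps=200)

# Masculinity at birth implied by the two independent projections.
masculinity = [m / f for m, f in zip(male, female)]

for t in (0, 50, 100, 150, 200):
    print(f"step {t:3d}: B_M/B_F = {masculinity[t]:.3f}")
```

With unequal net rates the ratio of male to female births drifts steadily away from its starting value instead of remaining constant, which is the behaviour that a fixed masculinity at birth rules out.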

In this situation Karmel and Pollard have discussed the several possible solutions which follow.

(i) The usual practice of demographers has been to use the system (76), (77), and (78), but (as remarked in par. 168 here) simply to omit (77), which certainly is not correct.

(ii) The total births could be expressed in the corresponding form

$$B_T(t) = \int_0^{\infty} B_T(t-x)\,\varphi_T(x)\,dx, \tag{79}$$

where in $\varphi_T(x)$ the total birth rate, $f_T(x)$, would be based on half the total births to persons aged $x$ since each birth would be counted twice; and from (78) and the condition $B_M(t) + B_F(t) = B_T(t)$ we should have

$$B_M(t) = \left(\frac{m}{1+m}\right) B_T(t) \tag{80}$$

and

$$B_F(t) = \left(\frac{1}{1+m}\right) B_T(t). \tag{81}$$

This system (which was mentioned by R. R. Kuczynski in his “Fertility and Reproduction”) is determinate. Karmel, however, views it (in J.I.A., LXXIV, 330-31) as unsatisfactory because $\varphi_T(x)$ must depend on the varying sex distribution of the population.

(iii) A. H. Pollard (in J.I.A., LXXIV, 307-13 and 330) suggested tracing the total births by following the female children of males and the male children of females. If accordingly we here use $\psi(x)$ to denote the probability at birth that a male child will have a female child at age $x$, and $\zeta(y)$ for the probability at birth that a female child will give birth to a male at age $y$, the female births $B_F(t)$ would be $\int_0^{\infty} B_M(t-x)\,\psi(x)\,dx$ and the male births $B_M(t)$ would be $\int_0^{\infty} B_F(t-y)\,\zeta(y)\,dy$. Consequently

$$B_F(t) = \int_0^{\infty}\!\!\int_0^{\infty} B_F(t-x-y)\,\psi(x)\,\zeta(y)\,dx\,dy \tag{82}$$

and

$$B_M(t) = \int_0^{\infty}\!\!\int_0^{\infty} B_M(t-x-y)\,\psi(x)\,\zeta(y)\,dx\,dy \tag{83}$$

while also (as remarked by Karmel, J.I.A., LXXIV, 331)

$$B_M(t) = m(t)\,B_F(t), \tag{84}$$

where $m(t)$ is the masculinity at birth expressed as a function of time. Here there are three unknowns and three equations, so the system is determinate. The expression $\int_0^{\infty}\!\int_0^{\infty} \psi(x)\,\zeta(y)\,dx\,dy$ can then be used as a unique Joint Reproduction Rate which is independent of sex (a numerical example of its calculation being shown in J.I.A., LXXIV, 312). This solution, of course, still attacks the problem only in a formal mathematical sense, and therefore is again open to the practical objection (see J.I.A., LXXIV, 321, 325, 327, and 331) that its basic assumptions, as in the earlier formulations of the theory underlying reproduction rates, are unrealistic.
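One way of reading the joint rate is to note that the first function depends only on x and the second only on y, so that the double integral separates into a product. The short LaTeX note below records this; the numerical values in its last line are purely hypothetical and serve only to show the arithmetic.

```latex
\int_0^{\infty}\!\!\int_0^{\infty} \psi(x)\,\zeta(y)\,dx\,dy
  = \left(\int_0^{\infty} \psi(x)\,dx\right)
    \left(\int_0^{\infty} \zeta(y)\,dy\right)
% first factor: expected daughters per newborn male
% second factor: expected sons per newborn female
% hypothetical illustration: 1.10 \times 1.05 = 1.155
```

On this reading the joint rate would appear to follow a line of descent through two successive generations (for instance from a female to her sons and thence to their daughters), so that its magnitude is presumably not directly comparable with a one-generation net rate.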

(iv) Criticizing equations (82)-(84) from the viewpoint that the functions $\psi(x)$ and $\zeta(y)$ do not correspond to reality since males do not produce female children exclusively and vice versa, and that the masculinity $m(t)$, which in reality is almost constant, can vary peculiarly under condition (84), P. H. Karmel in J.I.A., LXXIV, 331-32 set out the equations involving directly a marriage function and birth rates to couples, instead of following the female-to-female and male-to-male births as in equations (76)-(78), or Pollard’s female-to-male and male-to-female system of (82)-(84). Under the simplest conditions on this basis, $M_M(x, y)$ could represent the probability at birth that a male will be living and married at age $y$ to a female aged $x$, $M_F(x, y)$ the probability at birth that a female will be living and married at age $x$ to a male aged $y$, and $f_F(x, y)$ the rate of female births to couples aged $x$ and $y$, so that

$$B_F(t) = \int_0^{\infty}\!\!\int_0^{\infty} B_F(t-x)\,M_F(x, y)\,f_F(x, y)\,dx\,dy \tag{85}$$

$$B_M(t) = \int_0^{\infty}\!\!\int_0^{\infty} B_M(t-y)\,M_M(x, y)\,m\,f_F(x, y)\,dx\,dy \tag{86}$$

and

$$B_M(t) = m\,B_F(t). \tag{87}$$

This system, however, is again over-determinate.

(v) In J.I.A., LXXIV, 332-34, Karmel therefore explored an approach (of which, incidentally, the usual method of calculating only a female rate is a special case) by recasting the nuptiality conditions. This led to more complex equations which are determinate. The method at one point takes an arbitrary average, however; and after an elaborate analysis Karmel concluded that “as yet it does not seem to be of practical utility.”

173. Reproduction Rates Based on More Refined Birth Rates. All the reproduction rates discussed in the preceding paragraphs have been constructed by using birth rates founded only on the ages of the mother, father, or the couple. Birth rates, however, depend largely on marriage durations as well as parents’ ages, and also on order of birth, the number of children born alive to each mother, and sterility and spinsterhood. Suggestions have consequently been made with the object of improving the bases of reproduction rates by taking such factors into account.

(i) A net reproduction rate based on ages at and durations of marriage was proposed by C. Clark and R. E. Dyne (in the Economic Record, Australia, XXII, 23). If the ratio ${}_yb_r$ represents the annual female births during the base year at marriage duration $r$ and age at marriage $y$* divided by the corresponding annual marriages, and [. . .] is the proportion of females aged $y$ who marry at that age, the female rate can then be written as

[. . .]

A numerical example and analysis of this index has been given by A. H. Pollard in J.I.A., LXXIV, 293-305. It requires tabulations of births according to the mother’s age at marriage and calendar year of marriage. This information is available in Australia where the method was proposed, but is not recorded on most birth certificates such as the U.S. form shown in par. 22.

(ii) The dependence of birth rates and reproduction rates upon order of birth, number of children born alive, and sterility and spinsterhood has been examined by P. K. Whelpton (“Reproduction Rates Adjusted for Age, Parity, Fecundity, and Marriage,” J.A.S.A., XLI, 501). Using

* This procedure is a development of an earlier but less satisfactory method due to P. H. Karmel (Economic Record, XX, 74, and see J.I.A., LXXIV, 292), which was based on values of $b_r$ for marriage durations $r$ irrespective of ages at marriage (instead of ${}_yb_r$ for marriage durations $r$ and ages at marriage $y$ as used by Clark and Dyne).




the term “parity” to mean the number of children born alive (in the sense that a zero-parity woman has had no live child, a first-parity woman has had one, etc., so that only the woman of $n$-parity can be exposed to the risk of bearing a child of $n + 1$ order), Whelpton illustrated the calculation of female gross and net reproduction rates, and intrinsic rates of natural increase (see par. 174 here), from birth rates which took into account parity as well as age, and also were refined further to allow for sterility and spinsterhood by assuming that 10% of women are sterile and 10% cannot marry before age 50. The numerical effects of these adjustments were not large. The investigation served, however, to direct attention again to the internal contradictions which are implied in the construction of reproduction rates from birth rates based on age alone (cf. A. J. Lotka, Annals of Mathematical Statistics, XIX, 205).

(iii) “Generation” reproduction rates, based on the total number of children ever born to a particular generation of women throughout their child-bearing years, have been discussed by M. Depoid (“Études Démographiques,” No. 1, Statistique Générale de la France) and T. J. Woofter (Human Biology, XIX, and J.A.S.A., XLIV, 500). Such rates possess the merit of replacing the assumption of unvarying mortality and fertility by the different rates appropriate in successive calendar years for a designated generation of women (between, say, ages 15 and 45); but they have the disadvantage that the calculations are extended over a long span of past years.

(iv) A “cohort” replacement rate has therefore been suggested and illustrated by T. J. Woofter in J.A.S.A., XLIV, 512, with the object of obtaining a measure based on data closer to the current year. He remarks, however, that the resulting rates “refer neither wholly to the present nor wholly to the complete experience of generations,” and thus are still affected by the insoluble difficulties which largely vitiate all these attempts to find a single numerical reproduction index.

174. The Inherent Rate of Natural Increase. Equations (76), (77), and (79), for females, males, and persons respectively can be written generally, by dropping the subscripts, as

$$B(t) = \int_0^{\infty} B(t-x)\,\varphi(x)\,dx. \tag{88}$$

This fundamental equation, as will be apparent from its derivation in par. 172, does not refer explicitly to any initial state and therefore holds regardless of the origins of the population. This, of course, is both natural and proper, for evolution in nature is a continuous process starting from origins of which we have no knowledge. As will be noted later in par. 175




and formula (90), the arbitrary origins of the population are represented in the mathematical development by coefficients $Q_n$, so that the problem becomes determinate.

It will be evident, moreover, that the birth function, $B(t)$, although certainly continuous, will not be expressible under the assumed conditions by the same function of $t$ at all points during its development. This may be seen, for example, from the synthesis given by E. C. Rhodes in J.R.S.S., CIII, 62 et seq., tracing the number of female births year by year which follow, on the assumptions here under consideration with λ and L delimiting the child-bearing ages, in successive generations from $N$ female children all born at time 0.* A single function used throughout its range therefore will give only an approximate representation of $B(t)$. If, nevertheless, we assume that the same function for $B(t)$ can be substituted on both sides of (88), it can be seen that the equation will be satisfied, for all values of $t$, by a series of exponentials

$$B(t) = Q_1 e^{r_1 t} + Q_2 e^{r_2 t} + \ldots = \sum Q_n e^{r_n t} \tag{89}$$

so long as

$$e^{r_n t} = \int_0^{\infty} e^{r_n (t-x)}\,\varphi(x)\,dx = e^{r_n t} \int_0^{\infty} e^{-r_n x}\,\varphi(x)\,dx,$$

that is, so long as

$$\int_0^{\infty} e^{-r_n x}\,\varphi(x)\,dx = 1$$

for each value of $r_n$.† Theoretically the roots $r_1, r_2, \ldots$, real or complex,

* Under these circumstances no children will be born while the original $N$ are growing to maturity at age λ, so that then $B(t)$ is 0; between λ and 2λ the children born at time $t$ will be $N\varphi(t)$; from 2λ to 3λ children of the second generation will also appear because some of those born after λ will have reached maturity, so that $B(t)$ will again be different; and so on. Generally, the result can be written $B(t) = N[\varphi(t) + \varphi_1(t) + \varphi_2(t) + \ldots]$. For large values of $t$ the earlier functions, of course, make no contribution, and the number of functions involved is limited.

† The mathematical treatment given in this paragraph is a condensed statement of the theory as it has been evolved since about 1907 by numerous authors through many papers written in French, German, Swedish, Italian, and English. In 1909 L. Herbelot (Bulletin Trimestriel de l’Institut des Actuaires Français, XIX, 293) stated the fundamental integral equation of the type of (88) in an actuarial problem, using for its solution the method of successive differentiations, this method also being employed subsequently by Risser, Zwinggi, and Schulthess in France and Switzerland. In 1908, however, P. Hertz (Mathematische Annalen, LXV, 1) and G. Herglotz (ibid., LXV, 87) had given a convenient form of solution (as employed here). In 1911 this solution was therefore adopted by F. R. Sharpe and A. J. Lotka (Philosophical Magazine, XXI, 435) in an investigation of the problems of age distribution and natural increase; and thereafter numerous papers appeared on various aspects of the questions involved. Among these papers it may be noted particularly that in 1931 S. D. Wicksell (Skandinavisk




of this last equation are infinite in number; there is, however, only one positive real root (since $\int_0^{\infty} e^{-rx}\,\varphi(x)\,dx$, in which $\varphi(x)$ is by nature a positive function, decreases as $r$ increases). The complex roots define the oscillations about the real root which gives the ultimate constant value of $r$. Furthermore, it can be shown (cf. Rhodes, loc. cit., p. 77) that (89) tends to $Qe^{rt}$ as $t$ approaches ∞; that is to say, the ultimate form which the birth function $B(t)$ attains, under the assumed conditions of fixed birth and mortality rates by age, is $Qe^{rt}$. (The contributions to $B(t)$ given by the complex roots, and the manner in which $B(t)$ when $t$ is large takes the form $Qe^{rt}$ where $r$ is the real root, can be seen from the numerical illustrations in J.R.S.S., CIII, 85, and in several of Lotka’s papers.)
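Because the integral decreases steadily as r increases, the real root can be located numerically by simple bisection. The Python sketch below does so for a discrete stand-in for the integral, taking the seven column (5) values of the illustrative table of par. 169 as the net fertility concentrated at the age-group midpoints. That discretization, the bracketing interval, and the function names are assumptions made for illustration and are not the method of the text.

```python
import math

# Net fertility per 5-year age group (col. (5) of the illustrative table),
# treated as concentrated at the assumed midpoints of the groups (col. (3)).
midpoints = [17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]
net_fertility = [0.05165, 0.29025, 0.35911, 0.26921, 0.15586, 0.04666, 0.00376]


def characteristic(r):
    """Discrete stand-in for the integral of exp(-r * x) * phi(x) dx."""
    return sum(phi * math.exp(-r * x) for x, phi in zip(midpoints, net_fertility))


def real_root(lo=-0.1, hi=0.1, tol=1e-12):
    """Bisection for characteristic(r) = 1; valid because the function decreases in r."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if characteristic(mid) > 1.0:
            lo = mid  # value still above 1, so the root lies at a larger r
        else:
            hi = mid
    return 0.5 * (lo + hi)


r = real_root()
print(f"characteristic(0) = {characteristic(0.0):.4f}, real root r = {r:.5f}")
```

At r = 0 the sum is simply the net reproduction rate, so a net rate above unity places the real root above zero; for these figures the root falls close to the .00563 obtained from the quadratic approximation in the table's note.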

Starting again, as in par. 172, with $B(t-x)$ births at time $t-x$, the number who survive and are aged $x$ to $x+dx$ at time $t$ will be $B(t-x)\,{}_xp_0\,dx$. Writing $p(x, t)$ for $B(t-x)\,{}_xp_0$, and substituting the ultimate form of the birth function just established, it follows that when $t$ is large

$$p(x, t) = Q e^{r(t-x)}\,{}_xp_0 = Q e^{rt}\left(e^{-rx}\,{}_xp_0\right). \tag{90}$$

Also, the total population at time $t$, say $P(t)$, is $\int_0^{\omega} p(x, t)\,dx$ where ω is the limit of life; consequently from (90) when $t$ is large

$$P(t) = Q e^{rt} \int_0^{\omega} e^{-rx}\,{}_xp_0\,dx. \tag{91}$$

From (90) and (91) the ratio

$$\frac{p(x, t)\,dx}{P(t)} = \frac{e^{-rx}\,{}_xp_0\,dx}{\int_0^{\omega} e^{-rx}\,{}_xp_0\,dx},$$

Aktuarietidskrift, XIV, 125) obtained another solution (for which see J.I.A.,
LXXIV, 291) of the integral equation, by fitting a Type III Pearson curve to
the net fertility function $\phi(x)$, which in practice gives identical
results, and that A. J. Lotka contributed largely to the development of the
subject. A valuable and extensive bibliography, and comments on this history,
may be found conveniently in Lotka’s paper in the Annals of Mathematical
Statistics, X, 1. The original sources should be consulted for detailed
treatments of some of the more rigorous mathematical points which arise. The
papers of E. C. Rhodes in J.R.S.S., CIII, 61, 218, and 562 provide an
especially valuable summary.

An entirely different mathematical approach based on matrices and vectors has
been given recently by P. H. Leslie; see his papers on “The Use of Matrices in
Certain Population Mathematics” in Biometrika, XXXIII, 183 (with further notes
in the same journal for 1948), and “The Distribution in Time of the Births in
Successive Generations,” J.R.S.S., CXI, 44. A. J. Lotka has observed (Annals of
Mathematical Statistics, XIX, 200) that “the first application of the matrix
method to these problems seems to be due to H. Bernardelli, ‘Population
Waves,’ Journal of the Burma Research Society, XXXI (1941), 1.” This
treatment, however, need not be included here.





being the proportion of the total population at time $t$ which is aged $x$ to
$x + dx$, is then seen to be independent of $t$ when $t$ is large; that is to
say, the population, which is subject to unchanging mortality and birth rates,
and regardless of its origins, ultimately attains a stable age distribution.
This proposition was first established by Sharpe and Lotka in 1911
(Philosophical Magazine, XXI, 435; see also Proceedings of the Eighth
American Scientific Congress, VIII, 290).

Writing $\int_0^\omega e^{-rx}\,{}_xp_0\,dx = 1/b$, which is independent of
$t$, (91) becomes

$$P(t) = \frac{Qe^{rt}}{b}.$$


In this population the rate of increase is $r$. It therefore may be said
that a population which is subject indefinitely to unchanging birth and
mortality rates by age (with no immigration or emigration) would
ultimately, whatever its origins may have been, settle down to a stable
age distribution which increases (necessarily by an excess of births over
deaths) at the inherent rate $r$. This is called The Inherent Rate of Natural
Increase.* It is, as will be seen from the preceding analysis, the value
which satisfies the integral equation

$$\int_0^\infty e^{-rx}\phi(x)\,dx = 1. \tag{92}$$


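The numerical solution for $r$ in (92) cannot be effected directly. A
sufficiently accurate value, however, can be obtained (see A. J. Lotka,
J.A.S.A., XX, .klO- .12) by solving the quadratic equation

$$\tfrac{1}{2}\beta r^2 + \alpha r - \log_e R'_0 = 0, \tag{93}$$

where

$$R'_n = \int_0^\infty x^n\phi(x)\,dx, \qquad \alpha = \frac{R'_1}{R'_0},$$

and

$$\beta = \alpha^2 - \frac{R'_2}{R'_0}.$$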


* Alternative designations are the “intrinsic rate of natural increase,” the
“true rate of natural increase,” the “ultimate rate of increase,” or the
“natural rate of increase.”

† A closely approximate value can also be obtained from the following
considerations. The net reproduction rate (here indicated by $R'_0$, the primes
distinguishing these symbols from the $R$ of the customary linear compounding
notation used in the footnote to par. 111) is actually the ratio of the total
births in two successive generations. The mean length, $\alpha$ say, of one
generation evidently is given in the table of par. 168 by $R'_1/R'_0$. Since
the inherent rate of natural increase, $r$, is an annual rate, it follows that
$(1 + r)^\alpha = R'_0$, or $R'_0 = e^{r\alpha}$, approximately (this relation
also emerging from (93) by dropping the first term). In the numerical example,
$r$ thus computed approximately from $\log_e R'_0$ and $\alpha$ only is .00501
in comparison with .00563 obtained from (93).




In the example of par. 168 here, the value calculated by this method is 
shown to be .00563 —that is to say, the female population is there esti¬ 
mated to possess an inherent rate of natural increase which, under its 
assumed unvarying mortality and birth rates by age and if it is left to 
itself and there is no migration, ultimately (i.e., when the population has 
attained a stable age distribution) would reach .563% p.a., being an
increase of 5.63 daughters per 1,000 female population per annum.
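
The calculation of par. 174 can be sketched numerically as follows. This is only an illustration under stated assumptions: the net fertility schedule `phi`, its five-year grouping, and every function name are hypothetical (they are not the figures of par. 168), the integral is replaced by a sum over age-group midpoints, and the quadratic is taken in the form usually attributed to Lotka, $\tfrac12\beta r^2 + \alpha r - \log_e R_0 = 0$ with $\alpha = R_1/R_0$ and $\beta = \alpha^2 - R_2/R_0$.

```python
import math

# Hypothetical net fertility schedule (assumption, for illustration only):
# phi[j] is the expected number of daughters borne by a newborn girl while
# she is in the five-year age group whose midpoint is ages[j].
ages = [17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]
phi = [0.10, 0.35, 0.33, 0.22, 0.11, 0.035, 0.005]


def lotka_sum(r, ages, phi):
    """Discrete stand-in for the integral of exp(-r*x) * phi(x) dx."""
    return sum(math.exp(-r * x) * f for x, f in zip(ages, phi))


def r_by_bisection(ages, phi, lo=-0.1, hi=0.1, tol=1e-12):
    """Real root of lotka_sum(r) = 1; the sum decreases as r increases."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lotka_sum(mid, ages, phi) > 1:
            lo = mid   # sum still too large, so the root lies above mid
        else:
            hi = mid
    return (lo + hi) / 2


def moment(n, ages, phi):
    """R_n, the n-th moment of the net fertility schedule."""
    return sum(x ** n * f for x, f in zip(ages, phi))


def r_by_quadratic(ages, phi):
    """Root of (beta/2) r^2 + alpha r - ln(R0) = 0 lying near ln(R0)/alpha."""
    r0, r1, r2 = (moment(n, ages, phi) for n in (0, 1, 2))
    alpha = r1 / r0
    beta = alpha ** 2 - r2 / r0
    return (-alpha + math.sqrt(alpha ** 2 + 2 * beta * math.log(r0))) / beta


def r_first_order(ages, phi):
    """Cruder value from R0 = exp(r * alpha), i.e. dropping the r^2 term."""
    r0, r1 = moment(0, ages, phi), moment(1, ages, phi)
    return math.log(r0) / (r1 / r0)


print("bisection  :", r_by_bisection(ages, phi))
print("quadratic  :", r_by_quadratic(ages, phi))
print("first order:", r_first_order(ages, phi))
```

With a schedule of this general shape the quadratic root agrees closely with the root found by bisection, while the first-order value falls slightly short of both when the net reproduction rate exceeds 1.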

175. At this stage it will be well to make clear the relationships be¬ 
tween the hypothetical population represented by the mathematical 
model examined in par. 174, which starts from any origins whatever and 
ultimately reaches a stable age distribution increasing at rate r, and the 
hypothetical life-table population of customary actuarial practice which 
starts with a fixed number of births ($l_0 = N$, say) and is stationary
(r = 0). 

In the more general population increasing at rate $r$ it has been
shown in (91) that the total population, $P(t)$ at time $t$, is ultimately
$Qe^{rt}\int_0^\omega e^{-rx}\,{}_xp_0\,dx$. The birth rate at time $t$ when
$t$ is large is

$$\frac{B(t)}{P(t)} = \frac{1}{\int_0^\omega e^{-rx}\,{}_xp_0\,dx},$$

which is the $b$ independent of time already used. The death rate is
evidently $b - r$. In the life table, on the other hand, where $r = 0$ and the
population is stationary so that time $t$ may be disregarded, the total
number living similarly computed is $N\int_0^\omega {}_xp_0\,dx$;* the birth
rate is

$$\frac{N}{N\int_0^\omega {}_xp_0\,dx} = \frac{1}{\int_0^\omega {}_xp_0\,dx},$$

and since the population is stationary the deaths and births are equal.
Thus the life table is the particular case of the generalized population
when $r = 0$.
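
The relationship between the two hypothetical populations may be illustrated by a short sketch. The survival curve `surv` (falling linearly to zero at age 100), the single-year midpoint approximation to the integrals, and the function name are all assumptions made for the illustration; only the formulas for $b$, for the death rate $b - r$, and for the age distribution are taken from the text.

```python
import math


def stable_population(surv, r):
    """Return (birth rate, death rate, age distribution) for growth rate r.

    surv[x] is an assumed value of xp0 at the midpoint of age x to x+1, so
    the sums below are midpoint approximations to the integrals in the text.
    """
    weights = [math.exp(-r * (x + 0.5)) * s for x, s in enumerate(surv)]
    total = sum(weights)          # approximates integral of exp(-rx) xp0 dx
    b = 1.0 / total               # birth rate, independent of time
    age_dist = [w / total for w in weights]
    return b, b - r, age_dist


# Hypothetical survival curve (assumption): linear decline to zero at age 100.
surv = [1.0 - (x + 0.5) / 100 for x in range(100)]

# Stable population growing at the inherent rate quoted in par. 174.
b_stable, d_stable, dist_stable = stable_population(surv, 0.00563)

# Life-table (stationary) population: the particular case r = 0.
b_life, d_life, dist_life = stable_population(surv, 0.0)

print("stable    : b =", b_stable, " d =", d_stable)
print("life table: b =", b_life, " d =", d_life)
```

Setting `r = 0` makes the birth and death rates coincide, which is the life-table case; a positive `r` raises the birth rate and shifts the age distribution toward the younger ages.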

The analogies here shown also may serve to emphasize the extent to
which actuarial life-table procedures, devised for computational facility
on the assumption of a stationary population for technical convenience
only (cf. par. 64, footnote), have been transplanted into the wider
demographic field in attempts to deal with the very different problems of
reproduction theory.

* This shows immediately, by analogy, that the $Q$ in (91) is in fact
determined by the origins of the population.




176. The inherent rate of natural increase evolved in pars. 174-75
can be modified suitably, of course, if the mathematical model employed
is elaborated in accordance with the ideas discussed in pars. 172(i)-(v)
and 173. Thus Pollard in J.I.A., LXXIV, 308 et seq., investigated the
“joint rate of natural increase” on his basis of par. 172(iii), and also gave
the rates on the assumptions made by Karmel and by Clark and Dyne
as noted in par. 173(i). The rates emerging from the procedures of par.
173(ii) are shown in Whelpton’s paper.

177. The Replacement Index. All the reproduction rates and inherent
rates of natural increase discussed in this Section require the use of birth
rates by age (and sometimes also by durations of marriage). If those
birth rates are not available (as might be the case for small groups of the
population), a simple Replacement Index based only on the given population
in age groups and a suitable life table has been suggested. The
calculation is made by taking the ratio of the female children in a given
age group in the actual population to the women in that population who
would have been in the reproductive age period when the children were
born, and dividing by the corresponding ratio from the life table. This
rough method (which was suggested by W. S. Thompson, and has been
analyzed by A. J. Lotka in J.A.S.A., XXXI, 273), was found by A. H.
Pollard to give values which move closely parallel to the net reproduction
rate (see J.I.A., LXXIV, 289, 314, and 317).
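
The arithmetic of the Replacement Index can be sketched in a few lines. The function name, the choice of age groups (girls under 5 and women 20-44), and all the figures are hypothetical and serve only to show the two ratios being compared.

```python
def replacement_index(girls, women, lt_girls, lt_women):
    """Ratio of the observed child-woman ratio to the life-table ratio.

    girls, women       : enumerated female children and women of the
                         corresponding reproductive ages (actual population)
    lt_girls, lt_women : the same two age groups in the stationary
                         life-table population
    """
    actual_ratio = girls / women
    life_table_ratio = lt_girls / lt_women
    return actual_ratio / life_table_ratio


# Hypothetical figures (assumptions): girls aged 0-4 and women aged 20-44.
index = replacement_index(girls=54_000, women=400_000,
                          lt_girls=450_000, lt_women=3_600_000)
print("replacement index:", index)
```

An index above 1 indicates that the actual population carries more young girls per woman of reproductive age than a stationary population subject to the same mortality would carry.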

178. In pars. 170-71 the limitations surrounding the theory of reproduction
rates were emphasized, and from the development shown in
par. 174 it will be apparent that the same basic theory underlies both 
reproduction rates and the inherent rate of natural increase. The funda¬ 
mental concepts of both measures, therefore, are equally vulnerable. Al¬ 
though the theory must, it seems, be included in this Study as an in¬ 
teresting application of actuarial technique demanding alert critical 
faculties, it is almost impossible for these rates to escape the charge that 
they are fatally unrealistic, and that consequently (as already observed 
in par. 171) their basic theories and numerical values cannot displace 
detailed analyses of the actual mortality, marriage, and birth rates by 
ages and durations in any examination of the cumulative trends in deaths, 
marriages, births, and the resulting population. 



XV 


SICKNESS DATA 

179. The statistician’s interest in records of sickness (or “morbidity”) 
is a necessary consequence of his study of statistics of the previous
occurrence of birth and the subsequent happening of death. The possibility
of statistical analysis of the various types of sickness, however, is limited
to those ailments which are sufficiently serious to require the attention of
a doctor, for in other cases no record of the sickness could be obtained; 
and even in cases requiring medical attention it has usually been con¬ 
sidered practicable to require official record only of those diseases which 
either are infectious or indicate the existence of occupational conditions 
which it is desirable to remedy. 

180. Sickness data in respect of the general population may be collected
either (1) by census methods of enumeration, or (2) by continuous
registration. The first of these methods cannot be depended upon to give
any reliable data with regard to durations of illnesses; for, as in the case
of the attempts to collect death records on the census schedules (see par.
17, second footnote), the information usually would be obtained by
enumerators without any medical background, and would be supplied to
them by the persons concerned, or by third parties, who frequently would
have poor memories as well as defective knowledge. A census enumeration
of sickness, moreover, can give only a measure of the “prevalence” of
sickness at the date of the enquiry; and although the data may of course
be stated according to type of sickness, age, sex, occupation, etc., such
important statistical information as the number of days of disability per
person exposed can only be estimated roughly (cf. Sir Alfred Watson’s
remarks in J.R.S.S., XC, 467). Historical instances of the use of this
census method are to be found in the questions relating to sickness which
were included in the 1880 and 1890 United States census population
schedules. More recently the National Health Survey of 1935-36 by the
U.S. Public Health Service was conducted on lines which followed census
methods.

181. The collection of sickness data by (2), continuous registration,
originated in the necessity for public supervision over certain dangerous
infectious diseases, such as smallpox, and has been extended to include
other infectious and occupational maladies which also are preventable.
The problem has been considered repeatedly by the British and American
Medical Associations, and by other bodies, since about 1855. The first
experiments were of a voluntary or semi-voluntary character, Massachusetts
and Michigan in the United States inaugurating plans for voluntary
notification of certain diseases in 1874 and 1876 respectively, while in
England local Acts providing for compulsory notification in those
localities willing to do so were placed in operation in Huddersfield in 1876
and Bolton in 1877. In the United States, where such matters are under
the control of each individual State, the first system which was
compulsory and at the same time comprehensive seems to have been that
adopted by Michigan in 1883. In England the Infectious Diseases
(Notification) Act of 1889, which empowered any sanitary officer to
enforce compulsory notification, was extended in 1899 and placed in
operation over the entire country (see Newsholme’s “Vital Statistics,” pp. 123
et seq., and J. W. Trask’s Supplement No. 12, U.S. Public Health
Reports, 1914).

The systems thus inaugurated have been continually improved. In
Britain weekly returns of the notifiable diseases reported to them have
been made for many years by the local medical officers of health, these
returns being tabulated and published in the Registrar-General’s Quarterly
Reports, and summarized and interpreted in the Annual Reports. The
clearing system used in the United States was established in 1906 with the
U.S. Public Health Service as the central agency, through which weekly
Public Health Reports are published.

182. Since the value of the periodical returns in the United States
depends largely upon uniformity of procedure and adequate enforcement
by the individual States, a “Model State Law for Morbidity Reports”
was approved in 1913 by the Public Health Service and a number of
individual States. In addition to a list of notifiable diseases, and
regulations providing that the original reports shall be made by the attending
physician to the local health officer and thence to the State Department of
Health, a “standard notification blank” was adopted calling for information
by the physician as to (a) Name of disease, (b) Patient’s name, age,
sex, color, and address, (c) Occupation, (d) School attended by or place of
employment of patient, (e) Number of persons, adults, and children in
household, (f) Physician’s opinion of probable source of infection or
origin of disease, (g) If smallpox, whether mild or virulent, and number of
times and dates of vaccination, and (h) If typhoid fever, scarlet fever,
diphtheria, or septic sore throat, whether patient or any member of
household was engaged in the production or handling of milk (see
J. W. Trask, op. cit., p. 45).

A standard diagnosis code for use in tabulating morbidity statistics,
with the intention that it should be linked closely to the 1938 revision of
the International List of Causes of Death, was also developed in the
United States Public Health Service and the Bureau of the Census in
1940 (see Vital Statistics Special Reports, XII, No. 6) on the basis of a
system of codification which had been published by J. Berkson in 1936
(American Journal of Public Health, XXVI, 606, and Proceedings, Staff
Meeting, Mayo Clinic, XI, 396). After revision it was embodied in a
Manual for Coding Causes of Illness. A further important advance has
now been made by the promulgation of the 1948 revision of the
International List, under its new name as the International Statistical
Classification of Diseases, Injuries, and Causes of Death (see par. 146 here), by
which for the first time a single list has become available for the uniform
classification of morbidity as well as mortality.

183. Where notification is properly enforced in conjunction with
uniform coding and tabulation, valuable statistics as to the prevalence
and incidence of various types of sickness can be compiled. Rates of
sickness, expressing the number of cases for each disease in each population
group, may be stated by age, sex, occupation, etc., where the numbers are
sufficiently large, and the corresponding “crude” rates without distinction
of sex and age may also sometimes give a fair indication of the healthiness
of the community (although such crude rates are, as always, subject to 
the limitations pointed out in par. 138). It will be seen, however, that the 
data are not generally available for the computation of rates of sickness 
according to duration. 



XVI 


CONCLUSION 

184. This Study is concerned primarily with the methods of compiling
the various types of population statistics which are of value to the
actuary. It is thus intended to deal only incidentally with the subsequent
interpretation of the statistics so prepared, and it excludes consideration
of many other allied problems which are more particularly interesting to
the general statistician. These further questions are treated extensively 
in a number of the works mentioned in this Study, and also in many addi¬ 
tional references—both in English and other languages—which will be 
found in them. The student is therefore referred to those sources for any 
further information which he may require in subsequent research work, 
and particularly to the reports of the Registrar-General of England and 
Wales and the Bureau of the Census and the National Office of Vital 
Statistics in the United States, as well as the Journals of the Royal 
Statistical Society and the American Statistical Association, and the 
periodicals entitled “Population Studies” and “Population Index.”





APPENDIX 




SOME THEORY IN THE SAMPLING OF 
HUMAN POPULATIONS 

By W. Edwards Deming, Ph.D.

1. The Reasons for Sampling. Complete population censuses, in
which appropriate questions are asked in respect of every individual, are
large and ponderous; they employ many people; they are therefore costly.
Much time is required to collect the information, and thence to tabulate
the statistical data in vast hand and machine operations before the
material is ready for release. Through the use of proper statistical techniques,
a sample selection, instead of a complete count, can often be made with
notable economies in effort, time, and cost, and generally with more
efficient control. It must be remembered, however, that a sample does not
give data advantageously for very small areas; or, to put it in another
way, the most economical sample for small areas is often one of 100%.

From the results of a sample survey, estimates are prepared which are
expected to give approximations in varying degrees to the results that
would have been obtained if a complete count had been taken with the
same care and the same definitions. The degree of approximation is
measured by the “standard error” of the estimate. Repetition of a sample will
yield a distribution of estimates.

2. Uses of Sample Surveys. Sample surveys are used in connection
with population statistics mainly for the following purposes: 

(1) To obtain counts and characteristics of the population without
recourse to a complete census. For example, sample censuses of nine cities
in the United States were taken in 1944 (see “A Chapter in Population
Sampling,” Bureau of the Census, 1947). A more remarkable achievement
was the sample censuses of population and agriculture in Northern
Rhodesia and Southern Rhodesia, without benefit of a prior complete
census.

(2) To obtain additional characteristics of the population when the
frame* is furnished by a concurrent complete census or registration.
Examples are: (a) In the U.S. population census of 1940, supplementary
information was obtained from a 5% sample of the inhabitants, and in the
1950 census samples of 20% and 3⅓% were used (as described in par. 12

* “Frame” is explained in the section dealing with definitions. For the moment it
may be regarded as a list of the population, area by area, dwelling unit by dwelling unit,
or even name by name.



of this volume), thus broadening the scope of the census at little expense;
and further economies were effected by tabulating much of the regular
census information for a sample only. (b) In the Swedish extraordinary
census of 1936, information was obtained by direct interrogation from a
sample of names drawn from the national register. (c) Samples of areas are
used to test the completeness of birth and death registrations (as noted by
C. Chandrasekar and W. E. Deming, J.A.S.A., XLIV, 101; see also par.
46 of this volume).

(3) To obtain characteristics of the population without a concurrent
census. Thus the Current Population Survey of the United States,
initiated in 1939, gives monthly figures on employment and unemployment by
age-groups, hours worked in agricultural and non-agricultural pursuits,
and numerous other regular and occasional details. A similar example is
furnished by the quarterly survey in Canada, initiated in 1945.

3. Some of the Uses of Sampling in Connection with Censuses of
Population. The next step is to enumerate and to discuss some of the ways in
which sampling is used in connection with a complete census:

(a) In connection with a complete census, to ask certain questions of
only a sample of the population. (Aim: to broaden the scope of the census
at low cost, without proportionate increase in the burden of response or in
the time required for carrying out the work.)

(b) To tabulate a sample of a complete census. (Aim: to hasten the
results for large areas; to save time and expense in the production of some
of the tables that would ordinarily be produced from the complete
census; to broaden the scope of the tabulations.)

(c) To collect information from only a sample of areas or of other
units. (Aim: to replace a complete census by a sample census; to gain
speed; to decrease the cost or to provide information under conditions
that render a complete census either unnecessary or impossible.)

(d) To control the quality of the coding, punching, and tabulating, at
various stages of the processing. (Aim: to investigate the effects of such
errors on the published tables; to control and to improve the quality of
the finished product; to save time and money in the completion of the
work.)

(e) To investigate the quality and the meaning of the figures obtained
in a census. (Aim: in samples of areas, selected with the aid of statistical
theory, to carry out studies on completeness of coverage, over- and
under-enumeration, misunderstandings of the questionnaire, differences between
interviewers; to test various methods of obtaining the information.)

4. Definitions. The frame is a list (or file) of areas, farms, households,
people, or of business establishments that would be covered in a complete
count (a 100% sample). The sample is drawn from the frame. If the frame
fails to cover certain regions that are supposedly in the study, then both
a complete count and a sample will be in error, by about the same amount,
and for the same reason. A frame is the first step, for either a sample or a
complete count.

Each member of the frame (each line, or each card) is a sampling unit.
Each sampling unit must have a definite probability of being drawn into
the sample; it is this requirement that produces a probability sample, by
definition one whose standard error can be calculated.

An estimate is a number calculated from the results of the sample which
is expected to give an approximation, with calculable sampling error, to
the result that would have been obtained for the universe if the sample
had been total. The procedure of calculating an estimate must be included
as part of the sample-design. The degree of approximation of an estimate
is measurable by its standard error.

The standard error of an estimate is a measure of precision. It is the
standard deviation of all the possible estimates that may be formed from
a specified sampling procedure. To be usable, a sampling procedure must
be one for which the standard error is calculable from the results of a
sample. A standard error is not calculated from outside comparisons (vide
infra).

The bias of a procedure is a bias of the method, that is to say, of the
definitions used, the form of questionnaire, the method of canvass (e.g.,
whether mail or interview), the training given to the interviewers, the
procedure for selecting them, or of the formula of estimate. Bias, whether
the survey be a complete count or a sample, is detectable and measurable
only by comparing different methods in an approved experimental design.

The population of a sample-unit or of the frame is the number of
people in it conforming to a prescribed characteristic. In the ordinary
sense, the “population” of a city is its number of inhabitants, i.e., the
number of people therein possessing the characteristic of being alive at
midnight or at noon of a prescribed date. But a city or any sampling unit
has also many other populations such as the number of males, the number
of females, the number of children 10-14 in school, the number of
employed males 20-29, the number of births or of deaths last month, etc.
The symbol $a_i$ will be used to denote the population of sampling unit $i$ in
the universe, and the subscript $i$ will take the values 1, 2, . . . , $N$ for the
sampling units into which the frame will be divided. $P_i$ will denote the
probability of $a_i$ being drawn. (In the theory of multi-stage sampling, two
indexes will be needed to form symbols such as $a_{ij}$.)

The sample design is the blue-print of the procedure for drawing the 
sampling units into the sample and for forming the estimates. 

5. The Aim of Sample Design. Sample design requires the adaptation
of existing mathematical theory to the available facilities and
administrative restrictions, and the development of new theory and new facilities
when advantageous. The aims in modern sample design are these:

(i) To decide what precision is desirable in view of the probable costs
and uses of the data.

(ii) To meet this precision by laying out efficient, workable, and foolproof
mathematical procedures for (a) drawing the sampling units into the
sample; (b) calculating the estimates to be made; (c) calculating the
precisions of these estimates; (d) measuring any biases or differences
in need of measurement.

The greater the accuracy demanded, the greater the cost of the
survey. In all surveys (both complete counts and samples)
uncertainties arise from many sources (interviewer bias, lack of clarity of
definitions, non-response, etc.*), and it is wasteful to refine the
sampling error too far in view of these other errors. Moreover, as data
are needed as a basis for action, there is always a limiting precision
beyond which the action would not be affected; in the planning of
any survey, therefore, a certain “aimed at” precision will be specified
which might be, for instance, a 1% or perhaps 3% sampling error in
a population count, 10% in an inventory of wheat, 20% in a survey
of housing characteristics, etc.

In the planning of a survey for the first time the “aimed at”
precision can literally only be aimed at because some of the necessary
constants in the appropriate formula for the sampling error will
usually be known only approximately before the survey is taken. In
a series of repeated surveys, however, the accumulated experience
enables the cost of the surveys to be lowered and the precision to be
adjusted closely to the requirements.

(iii) To appraise the precision that was actually obtained, after the survey
is completed.

The constants in the formula for the sampling error, which were needed
in the planning of the survey, may be estimated from the returns of the
survey with some firmness, and with these constants the so-called “standard
errors” of the estimates made from the survey can be calculated.†
Further information concerning the reliability of the survey may be
gained by comparing its results with other surveys and studying any
significant differences. If any supplementary or simultaneous experiments
were conducted in order to make tests of coverage and definitions, or to
measure differences between interviewers, or between different versions
of the questionnaire, these experiments should be summarized, and their
bearing on the reliability of the results of the survey should be carefully
explained.

* A partial list of sources of uncertainties, with comments, was published by W. E.
Deming in the American Sociological Review, IX (1944), 359-69. This paper, revised,
now forms Chapter 2 in his book “Some Theory of Sampling” (1950).

† For examples of appraisal see “A Chapter in Population Sampling,” p. 14, and
Chapters 11 and 12 in Deming’s “Some Theory of Sampling.”

6. Random Variables; Random Numbers. A random variable is a
number produced by a random operation. Empirically it is possible, with
care, to simulate a random operation. The calculated distribution of a
random variable is thus to be regarded as a prediction of a real
distribution which may be obtained empirically under certain conditions.
Satisfactory simulation of the random operation of drawing a sampling unit,
giving all units equal probabilities, is realized in practice by the use of
random numbers; for instance, L. H. C. Tippett’s “Random Sampling
Numbers,” or R. A. Fisher and F. Yates’s “Statistical Tables for
Biological, Agricultural, and Medical Research.” In some of the descriptions
that follow it will be convenient to speak of the random operation of
drawing sampling units as the drawing of chips from a bowl containing $N$
physically similar chips, each marked to identify a one-to-one
correspondence between it and a particular sampling unit of the frame to be sampled.

If the $N$ sampling units are listed on $N$ lines or cards in any convenient
order, and numbered serially from 1 to $N$, the act of reading out a random
number between 1 and $N$ gives all $N$ sampling units equal probability, and
designates a particular sampling unit for the sample. The drawing of $n$
random numbers produces a sample of $n$ units.
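
The procedure of numbering the frame and reading out random numbers may be sketched as follows. A pseudo-random generator stands in for a printed table of random numbers; the frame of fifty dwelling units, the function name, and the choice of drawing without replacement are assumptions made for the illustration.

```python
import random


def draw_sample(frame, n, seed=None):
    """Draw n sampling units, each serial number being equally likely.

    The frame is numbered serially from 1 to N. rng.sample draws without
    replacement, so no serial number can appear twice in the sample.
    """
    rng = random.Random(seed)
    serials = rng.sample(range(1, len(frame) + 1), n)
    # Serial number k designates the k-th line of the frame.
    return [(k, frame[k - 1]) for k in serials]


# Hypothetical frame (assumption): 50 dwelling units listed in any order.
frame = ["dwelling unit %d" % k for k in range(1, 51)]

for serial, unit in draw_sample(frame, n=5, seed=1954):
    print(serial, unit)
```

Fixing the seed makes the drawing reproducible, which is convenient when a selection must be documented or audited afterwards.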

The mean of the frame will be

$$\mu = \frac{1}{N}\sum_{i=1}^{N} a_i \tag{1}$$

and its variance will be

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(a_i - \mu)^2. \tag{2}$$

Some of the $a_i$ values may be equal. Let them then be grouped into $M$
different classes which fall at

$$\xi_1, \xi_2, \xi_3, \ldots, \xi_M$$

with proportions

$$p_1, p_2, p_3, \ldots, p_M.$$

In this case

$$\mu = \sum_{i=1}^{M} p_i\xi_i \tag{3}$$

and

$$\sigma^2 = \sum_{i=1}^{M} p_i(\xi_i - \mu)^2 = \sum_{i=1}^{M} p_i\xi_i^2 - \mu^2. \tag{4}$$

The square root of the variance of a universe is its standard deviation,
denoted by $\sigma$.

A useful measure of a distribution of which the mean is not near 0 is
the coefficient of variation, $\gamma$, defined as the standard deviation of the
universe measured in units of the mean, so that

$$\gamma = \frac{\sigma}{\mu}. \tag{5}$$

Let a sample of $n$ units be drawn by a random operation, and let the $n$
values of $a_i$ be recorded in the order drawn. Any one sample will appear as

$$x_1, x_2, \ldots, x_n.$$

Then, to give a number of examples, the following functions of the sample
are all random variables:

$$x_1;\quad x_2;\quad x_1^2;\quad x_1 + x_2, \text{ for instance};$$

$$\frac{1}{n}(x_1 + x_2 + \cdots + x_n) = \bar{x};\quad \frac{1}{n}(x_1^2 + x_2^2 + \cdots + x_n^2);$$

the median; the range; the maximum; the minimum.

Upon restoring the drawings, either one at a time as drawn, or after
the whole sample is drawn, a new sample and a new set of random
variables may be produced. The functions listed above will vary from sample
to sample in a random manner; this is why they are random variables.
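
The definitions of this paragraph can be sketched in a few lines. The ten frame values, the function names, and the seed are hypothetical; the sketch computes the mean, variance, and coefficient of variation of a small frame both directly and from grouped classes, and then evaluates several of the sample functions named above for one random sample.

```python
import random
import statistics
from collections import Counter

# Hypothetical frame (assumption): populations a_i of N = 10 sampling units.
frame = [3, 5, 2, 4, 6, 4, 3, 5, 4, 4]


def frame_summary(a):
    """Mean, variance (divisor N), and coefficient of variation of a frame."""
    n_units = len(a)
    mu = sum(a) / n_units
    var = sum((v - mu) ** 2 for v in a) / n_units
    return mu, var, var ** 0.5 / mu


def grouped_summary(a):
    """The same mean and variance from distinct class values and proportions."""
    n_units = len(a)
    props = {value: count / n_units for value, count in Counter(a).items()}
    mu = sum(p * value for value, p in props.items())
    var = sum(p * value ** 2 for value, p in props.items()) - mu ** 2
    return mu, var


def sample_functions(a, n, rng):
    """Draw n units without replacement and evaluate some sample functions."""
    xs = rng.sample(a, n)
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "range": max(xs) - min(xs),
        "max": max(xs),
        "min": min(xs),
    }


print(frame_summary(frame))
print(grouped_summary(frame))
print(sample_functions(frame, 4, random.Random(7)))
```

Repeating the last call with a different generator state gives a different set of values, which is the sense in which these functions are random variables.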

7. Fundamental Theorems. Mathematical manipulations of the $P$-values
of the chips in the bowl will lead to the distributions of random
variables such as those already listed. Thus,* the mean of the distribution of $\bar{x}$

* It is presumed that the student understands the use of the operator $E$ which in
the theory of sampling denotes a mathematical average. The operator is commutative:

$$E(x + y) = Ex + Ey, \qquad E\sum_{1}^{n} x_i = \sum_{1}^{n} Ex_i.$$





will be

$$E\bar{x} = \frac{1}{n}\left[\sum_{i=1}^{N} P_i a_i + \text{similar expressions to } n \text{ terms}\right]. \tag{6}$$

If all the $P_i$ are equal to $1/N$, i.e., if all $N$ have the same probability of
being drawn, this reduces to

$$E\bar{x} = \frac{1}{N}\sum_{i=1}^{N} a_i = \frac{A}{N} = \mu, \tag{7}$$

$A$ being defined as $\sum_{i=1}^{N} a_i$, the total population of all sampling
units constituting the frame, and $\mu$ being defined as the mean population
per unit. We thus see that $\bar{x}$ is an unbiased estimate of $\mu$ when all of
the $N$ have the same probability of being drawn.

The variance of the distribution of $\bar{x}$, on the theory that the drawings
are made without replacements, will be

$$\sigma_{\bar{x}}^2 = E(\bar{x} - E\bar{x})^2 = \frac{N-n}{N-1}\,\frac{\sigma^2}{n}. \tag{8}$$

The derivation of (8) is given, for instance, in Deming’s “Some Theory
of Sampling,” p. 101. It is to be remembered that if every chip has the
same chance of being drawn, and $P(r)$ denotes the probability of getting
exactly $r$ black chips in $n$ drawings,

$$P(r) = \binom{n}{r}p^r q^{n-r} \quad \text{if the chips are drawn with replacement} \tag{9}$$

or

$$P(r) = \frac{\binom{Np}{r}\binom{Nq}{n-r}}{\binom{N}{n}} \quad \text{if the chips are drawn without replacement,} \tag{10}$$

where $p$ is the proportion of black chips in the bowl, $q = 1 - p$, and

$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!},$$

the number of possible combinations of $n$ items taken $r$ at a time.

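Then it can be shown easily that, whether the drawings are made with or without replacement,

$$Er = \sum r\,P(r) = np \qquad (11)$$

and that

$$\sigma_r^2 = \sum r^2 P(r) - (Er)^2 = npq \quad \text{with replacement} \qquad (12)$$

or

$$\sigma_r^2 = \frac{N-n}{N-1}\,npq \quad \text{without replacement.} \qquad (12a)$$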


The formulae for drawings with replacements are, of course, based on the
supposition that the proportion black remains constant so that the “point
binomial” (the Bernoulli series) applies. When the drawings are made without
replacements, however, so that the proportion black does not remain
constant, the hypergeometric series provides the basis as in (10). See
Wolfenden’s “Fundamental Principles of Mathematical Statistics,” pp.
12-13, 27, and 65-66, and Deming’s “Some Theory of Sampling,” p. 121,
for the proof of (12a) without replacement.
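The moments in (11), (12), and (12a) can be verified numerically from the two probability series. The following sketch sums r P(r) and r² P(r) directly for the point binomial and for the hypergeometric series; the bowl size, number of black chips, and sample size are assumed values chosen only for illustration.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Equation (9): drawings with replacement."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def hypergeometric_pmf(r, n, N, black):
    """Equation (10): drawings without replacement from N chips, `black` of them black."""
    return comb(black, r) * comb(N - black, n - r) / comb(N, n)

def mean_and_variance(pmf, n):
    """Er and sigma_r^2 computed from the series, as in (11) and (12)."""
    m = sum(r * pmf(r) for r in range(n + 1))
    v = sum(r * r * pmf(r) for r in range(n + 1)) - m * m
    return m, v

N, black, n = 50, 20, 10              # hypothetical bowl and sample size
p = black / N
q = 1 - p
m_with, v_with = mean_and_variance(lambda r: binomial_pmf(r, n, p), n)
m_wo, v_wo = mean_and_variance(lambda r: hypergeometric_pmf(r, n, N, black), n)
print(m_with, m_wo, n * p)                        # all three should agree
print(v_with, n * p * q)                          # (12)
print(v_wo, (N - n) / (N - 1) * n * p * q)        # (12a)
```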

The factor $(N - n)/(N - 1)$ in (8) is called the finite multiplier because
it arises from the finite size of N. It is 0 if n = N, in which case the
sample is complete, and there is no error in $\bar{x}$ at all; it is 1 if n = 1, and
approaches 1 as $N \to \infty$. It is often written in the approximate form
$1 - n/N$. Clearly, if the sample is small, for example if n/N = 5%,
expansion of the frame to double, treble, or 10 times its present size N will
have little effect on the variance of $\bar{x}$. Thus it will be realized that a sample
of 1,000 families drawn from a city containing 20,000 families is only
insignificantly more reliable than a sample of 1,000 families from the entire
45,000,000 families of the United States. The important measure of the
size of a sample is its absolute size, n, not its percentage of N.
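The comparison of the city and the nation can be put in figures. This small sketch evaluates the finite multiplier of (8) for both frames, and its square root, which is the factor applied to the standard error. The function names are assumptions made for the illustration.

```python
def finite_multiplier(N, n):
    """The factor (N - n)/(N - 1) appearing in (8)."""
    return (N - n) / (N - 1)

def relative_standard_error(N, n):
    """Standard error of x-bar expressed as a fraction of sigma/sqrt(n)."""
    return finite_multiplier(N, n) ** 0.5

city = relative_standard_error(20_000, 1_000)
nation = relative_standard_error(45_000_000, 1_000)
print(round(city, 4), round(nation, 4))
```

Under these figures the standard error for the city is smaller than that for the nation by less than 3 per cent, although the city's sampling fraction is more than two thousand times as large.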

It follows that a survey designed to give regional data for 9 geographic
regions is almost 9 times as expensive as a survey designed to give data
only for the country as a whole. Demands for local data must thus be
considered not only with regard to the need for such data, but with regard
to costs.

The square root $\sigma_{\bar{x}}$ of the variance $\sigma_{\bar{x}}^2$ of $\bar{x}$ is known as the standard error
of the estimate $\bar{x}$. More precisely it is the standard error of the particular
sampling procedure by which $\bar{x}$ is produced. The standard error is an
important measure of the variability of an estimate because 3 standard
errors measured each side of the mean of any ordinary distribution will
contain practically all of the distribution; hence a sampling error is
practically never observed outside the range $E\bar{x} \pm 3\sigma_{\bar{x}}$, where $E\bar{x}$ is the
mean of the distribution of the random variable $\bar{x}$, and $\sigma_{\bar{x}}$ is the standard
error of the estimating procedure. The range $E\bar{x} \pm 2\sigma_{\bar{x}}$ contains about
95% of the distribution of $\bar{x}$. (See Wolfenden’s “Fundamental Principles
of Mathematical Statistics,” p. 20.)

Suppose that a random variable X be defined as

$$X = N\bar{x}. \qquad (13)$$

Then

$$EX = N\mu = A. \qquad (14)$$

Thus X is an unbiased estimate of the total population A of the frame.

Since the mean square error (variance) of a multiple cx, when c is a constant, is given by the relation (see Wolfenden, op. cit., p. 23)

$$\sigma_{cx}^2 = c^2\sigma_x^2, \qquad (15)$$

it follows from (8) that

$$\sigma_X^2 = N^2\,\frac{N-n}{N-1}\,\frac{\sigma^2}{n}. \qquad (16)$$

Again adopting the convenience of the coefficient of variation, we may divide (8) by $\mu^2$ and (16) by $(N\mu)^2$ and obtain

$$C_X = C_{\bar{x}} = \sqrt{\frac{N-n}{N-1}}\,\frac{\gamma}{\sqrt{n}} \approx \frac{\gamma}{\sqrt{n}} \quad \text{when } \frac{n}{N} \text{ is small} \qquad (17)$$

where $C_X$ and $C_{\bar{x}}$ denote the coefficients of variation of this procedure for
estimating X and $\bar{x}$, and $\gamma$ is the coefficient of variation $\sigma/\mu$ of the frame.
This is a very useful and convenient form for the variances of the estimates.

Samples drawn with proper precaution with random numbers will be
found to give results in close conformity with these equations.

A case of great practical importance is the frame of 2 cells, which is useful
when the sampling units can be classed as black and white, or vacant
and not vacant, or passed and rejected, etc. In this case we shall assume
that

Nq are labeled $a_i = 0$
Np are labeled $a_i = 1$.

The mean of this frame is

$$\mu = p \qquad (18)$$

and its variance is

$$\sigma^2 = pq \qquad (19)$$

as may be derived directly from (3) and (4).

Let a sample of n be drawn from a 2-celled frame, and let r be the number
of black chips in the sample. Each black chip is to count 1, and each
white chip 0. Then the symbol which was $\bar{x}$ is now r/n and we shall let

$$\hat{p} = \frac{r}{n}. \qquad (20)$$

It follows from (7) and (8) that if all the chips have equal probabilities,

$$E\hat{p} = p \qquad (21)$$

and that

$$\sigma_{\hat{p}}^2 = \frac{N-n}{N-1}\,\frac{pq}{n}. \qquad (22)$$

Thus $\hat{p}$ or r/n is an unbiased estimate of p, and the variance $\sigma_{\hat{p}}^2$ of this
estimate has the value just written.

8. Single-Stage Sampling. The expressions just derived are those for
single-stage sampling, in which each of the N sampling units has the same
probability as another of being drawn. A frequently occurring application
is the sampling of cases from a file of cards, or of dwelling units, farms,
etc., from a list or map. Sometimes the frame (i.e., file, map, list, etc.)
will already be in existence; in other cases it must be made or brought up
to date. In sampling a small city for a count of the population and
estimation of the number of people and families having particular
characteristics, a complete listing of dwelling units over the entire city might
be the initial step, to be followed by interviews at the n dwelling units
falling into the sample, which can easily be drawn from the listing.

Suppose, for instance, that $\gamma$ in (17) has the value 0.7 for the number
of people per dwelling unit in some particular city in which a sample of
dwelling units is to be interviewed with the aim of estimating the
population of the city and the sizes of various classes thereof. Suppose
that this is to be a survey of great precision, in which the standard error
of the estimated number of inhabitants is to be 1%. Then $C_X$ in (17) is
to be .01. If N = 50,000 dwelling units, the arithmetic shows that the size
of sample should be about 4,500 dwelling units, or about 1 dwelling unit
in 10. To attain a standard error of 2%, a sample of only about 1,200
dwelling units would suffice. N here is so large that it has little effect; the
sample sizes would be about the same for a city of a million dwelling units.
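The arithmetic referred to consists of solving (17), with the finite multiplier retained, for n. A minimal sketch follows; the function name is an assumption, and the result is rounded upward to a whole dwelling unit.

```python
import math

def required_sample_size(N, gamma, target_cv):
    """Smallest n for which (17), with the finite multiplier, gives C_X <= target_cv.

    From C^2 = (N - n)/(N - 1) * gamma^2 / n it follows that
    n = N * gamma^2 / (C^2 * (N - 1) + gamma^2).
    """
    n = N * gamma**2 / (target_cv**2 * (N - 1) + gamma**2)
    return math.ceil(n)

print(required_sample_size(50_000, 0.7, 0.01))      # standard error of 1%
print(required_sample_size(50_000, 0.7, 0.02))      # standard error of 2%
print(required_sample_size(1_000_000, 0.7, 0.02))   # a much larger city
```

For the city of 50,000 dwelling units this gives 4,463 and 1,196, in agreement with the rounded figures quoted above.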

9. Systematic or Patterned Sampling. In the sampling of human populations
some people have used a “systematic or patterned” selection in
place of n random drawings. For a 1 in 10 systematic selection (a “decimating”
sample) the procedure would be to start with a random number
between 1 and 10, and to take this and every 10th dwelling unit thereafter
from the list of dwelling units. Straight systematic selections are now
being replaced gradually by modifications like the Tukey plan (infra).

If the listing of the N dwelling units were randomized to start with, a
systematic selection of n units from the list would possess no special
characteristics to distinguish it from a sample consisting of n independent
random selections in which a random number is read out for each unit as
it is drawn into the sample. However, in the process of listing, a lister
will start at one corner of an area and proceed in a systematic manner
up and down the streets, roads, and corridors until the job is finished: if
he does not make a systematic coverage, he will soon be lost and will list
some units twice and others not at all. A map showing small areas numbered
in serpentine fashion within a country, county, or city is, in effect,
a systematic list. A field or forest is a systematic list of rows, bands, or
lines, numbered serially east to west or north to south. Successive units in
most frames show some sort of serial correlation. The character of a
systematic sample thus arises from the systematic layout of the frame,
and not alone from the sampling.

The variance of an estimate formed from a systematic selection* will
for some materials be much smaller, and for other materials much larger,
than the variance of an estimate formed from a sample in which all n
units thereof were drawn separately at random, i.e., an “independent random
sample.” In the sampling of human populations, however, the
systematic and independent random procedures usually give about the
same variances.†

In the sampling of manufactured articles and equipment, intervals of
5, 10, and multiples thereof should be avoided in a systematic selection
because of the possibility of periodicities that may play havoc with the
equations for the variance. The strength of modern sampling lies in
computable standard errors, and a method must not be used for which
the equations for variance are no longer at least approximately applicable.
In case of doubt, a systematic or patterned selection should be avoided.

* The formulae for the variance of systematic sampling, which are very complicated,
have been derived recently by W. G. and L. Madow (“On the Theory of Systematic
Sampling, I,” Ann. Math. Stat., XV, 1-24).

† See J. G. Osborne, “On the Precision of Estimates from Systematic versus Random
Samples,” Science, XCIV, 584-85, and “Sampling Errors of Systematic and Random
Surveys of Cover-Type Areas,” J.A.S.A., XXXVII, 256-64.

To preserve the advantages of geographic stratification which a
systematic selection gives, it is simple to divide the frame into $\tfrac{1}{2}n$ successive
equal or approximately equal groups, and then to take two random
selections from each group. This is always a safe method, and is only
slightly more difficult to administer than a systematic selection straight
through the frame. The Tukey plan (vide infra, par. 10) preserves both
the simplicity of a patterned selection and the validity and the simplicity
of the formulae for random selections.
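The two methods of selection may be sketched as follows, for a frame of N units numbered serially from 1. The function names, the seed, and the frame size are assumptions made for the illustration; the second function assumes that n is even and that every group holds at least two units.

```python
import random

def systematic_selection(N, interval, rng):
    """Random start between 1 and `interval`, then every `interval`-th unit."""
    start = rng.randint(1, interval)
    return list(range(start, N + 1, interval))

def two_per_group_selection(N, n, rng):
    """Divide units 1..N into n/2 successive groups of (nearly) equal size
    and draw 2 units at random, without replacement, from each group."""
    groups = n // 2
    bounds = [round(g * N / groups) for g in range(groups + 1)]
    sample = []
    for lo, hi in zip(bounds, bounds[1:]):
        sample.extend(sorted(rng.sample(range(lo + 1, hi + 1), 2)))
    return sample

rng = random.Random(1954)                 # fixed seed so the run is repeatable
systematic = systematic_selection(100, 10, rng)
paired = two_per_group_selection(100, 10, rng)
print(systematic)
print(paired)
```

Both selections spread the sample over the whole list; the second gives, in addition, two independent drawings within each group, from which a variance can be computed.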

The statistician must balance the various advantages and disadvantages
against each other. Considerable knowledge regarding the statistical
characteristics of a material is necessary before one can be sure of
prescribing a highly efficient procedure. It is much more important, and
fortunately much simpler, to prescribe a procedure that is statistically
valid, i.e., one that is unbiased, or nearly so, and whose standard error is
computable from the sample itself.

10. The Appraisal of Precision. In the planning of a survey it may
have been assumed, for example, on the basis of experience or from a pilot
study that $\gamma$ in (17) for a particular characteristic is about 0.7. After
the sample of dwelling units has been interviewed it is possible to estimate
$\gamma$ very closely, and hence to estimate the standard errors of X and $\bar{x}$ very
closely also. Thus, whether $\gamma$ is near 0.7 or not, the standard errors of the
results will in the end be known, and a corrected value of $\gamma$ will be obtained
for more economic planning of the next survey of similar nature.

As $\gamma$ may be expected to vary from one city to another, the value obtained
for one city should not be used in another except as an approximation.
In practice we should take whatever values of $\gamma$ were encountered
in previous experience with similar materials and by speculation raise or
lower such values depending on known sociological conditions. Thus, in a
city where there is much doubling of families, or a tendency toward
large families, $\gamma$ may well be 25% higher than in some other city; such
conditions would call for a 50% bigger sample to meet the same precision.
Often in the planning stage it is advisable to take a preliminary sample on
a small scale to get a good estimate of $\gamma$ and to try out the questionnaire
and instructions, and to provide necessary practical experience.

(a) Once the interviewing is completed, an estimate of $\gamma$ may be made
by drawing at random a subsample n′ of the returns and performing calculations
thereon to correspond with the following equations:*

$$\sigma^2 \text{ (estimated)} = \frac{N-1}{N}\,\frac{1}{n'-1}\sum_{i=1}^{n'} (x_i - \bar{x}')^2, \qquad (23)$$


• See Deming’s “Some Theory of Sampling,” p. 3.^3. 



where

$$\bar{x}' = \frac{1}{n'}\sum_{i=1}^{n'} x_i \qquad (24)$$

and $x_i$ is the population on the ith return drawn into the subsample
(i = 1, 2, ..., n′). In practice $(N - 1)/N$ may often be replaced by
unity. The number n′ should be in the neighborhood of 100. $\sigma$, once estimated,
is then used in (8) or (17) to appraise the precision actually attained.
It is to be observed that this appraisal is made from the sample
itself. The design of the sample must therefore be followed rigidly in the
field, for otherwise such calculations may be misleading.
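The calculation in (23) and (24), followed by the appraisal through (17), may be sketched as below. The subsample is a short invented list (in practice n′ would be near 100), and the function name and the values of N and n are assumptions.

```python
from statistics import mean, variance

def appraise_precision(subsample, N, n):
    """Estimate sigma by (23)-(24), then gamma = sigma / x-bar', then C_X by (17).

    subsample : populations x_i on the n' returns drawn at random
    N, n      : size of the frame and of the full sample
    """
    x_bar = mean(subsample)                           # equation (24)
    sigma2 = (N - 1) / N * variance(subsample)        # (23); variance() divides by n' - 1
    sigma = sigma2 ** 0.5
    gamma = sigma / x_bar
    cv = ((N - n) / (N - 1)) ** 0.5 * gamma / n ** 0.5
    return sigma, gamma, cv

subsample = [2, 3, 4, 3, 5, 1, 4, 2, 3, 3]            # hypothetical persons per dwelling unit
sigma, gamma, cv = appraise_precision(subsample, N=50_000, n=4_500)
print(sigma, gamma, cv)
```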

(b) Another way is to plan the sample as 10 systematic or patterned
subsamples with 10 random starts, and to compute the results from each.
This is the Tukey plan (see Deming, op. cit.). The variability between the
10 subsamples gives measures of the standard errors of the results. Let
$X^{(1)}, X^{(2)}, \ldots, X^{(10)}$ be the 10 estimates of the population having some
particular characteristic, and let

$$X = \tfrac{1}{10}\,[X^{(1)} + X^{(2)} + \cdots + X^{(10)}];$$

then

$$\text{Estimated } \sigma_X^2 = \frac{1}{10 \cdot 9}\sum_{i=1}^{10} \big(X^{(i)} - X\big)^2. \qquad (25)$$

The sampling and tabulation plans may be so laid out that the identity of
the 10 subsamples is maintained, and estimates of the variances of the
chief characteristics are obtained automatically, with their precisions.

Before prescribing the Tukey plan, one should make sure that there 
will not be any serious loss in efficiency from the use of wide strata, and 
that the cost of tabulating the 10 subsamples will not be too great. 
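A minimal sketch of the computation from independent subsamples follows. It uses the usual estimate for the variance of a mean of k independent results, namely the sum of squared deviations divided by k(k − 1), which is assumed here to be the content of (25). The ten figures are invented.

```python
from statistics import mean, variance

def replicated_estimate(subsample_estimates):
    """Combine k independent subsample estimates into one estimate X and
    return X together with its estimated standard error."""
    k = len(subsample_estimates)
    X = mean(subsample_estimates)
    var_X = variance(subsample_estimates) / k   # sum of squares / (k * (k - 1))
    return X, var_X ** 0.5

# Hypothetical estimates (in thousands) from 10 subsamples with random starts
estimates = [98, 102, 101, 99, 100, 103, 97, 100, 101, 99]
X, standard_error = replicated_estimate(estimates)
print(X, standard_error)
```

The same few lines serve for every characteristic tabulated, provided the identity of the subsamples has been kept in the tabulations.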

(c) If a systematic sample was not laid out in independent subsamples,
the variance of any estimate X may still be approximated. Imagine loops
to be thrown around successive pairs of dwelling units in the sample.
These loops form hidden strata in the frame, created by the systematic
procedure of listing and sampling. The two dwelling units of a pair have
been drawn from one of these hidden strata. Each pair thus gives a slight
over-estimate of $\sigma^2$ for the stratum whence they were drawn. In practice
one may form n′ hidden strata. The equation for estimating the average
$\sigma^2$ is then (see Deming, op. cit., p. 333)

$$\text{Estimated Average } \sigma^2 = \frac{1}{2n'}\sum_{i=1}^{n'} R_i^2, \qquad (26)$$

where $R_i$ is the range between the two values of population constituting
the ith pair of returns (i = 1, 2, ..., n′). Although this device gives an
over-estimate of $\sigma^2$, it is in practice extremely helpful where no provision
was made beforehand for an unbiased estimate of the standard error.
n′ should usually be between 50 and 100. The work should of course be so
laid out that variances of several characteristics are obtained at once.
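The device of successive pairs is easily carried out. In this sketch the returns are taken in the order of the systematic selection, paired off two by two, and half the mean squared range is computed, which is the standard estimate of variance from a pair of observations. The function name and the returns are assumptions for illustration.

```python
def average_variance_from_pairs(returns):
    """Estimated average sigma^2 from successive pairs of returns:
    the sum of the squared ranges R_i divided by 2 n'."""
    pairs = list(zip(returns[0::2], returns[1::2]))   # n' successive pairs
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))

# Hypothetical populations on 8 successive returns (4 hidden strata)
returns = [3, 5, 2, 2, 4, 1, 6, 4]
print(average_variance_from_pairs(returns))
```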

It should be remembered that the calculation of a standard error (commonly
called the “sampling” error) gives a measure only of the precision
of the sampling, and not of the biases inherent in the definitions, in the
errors of response and non-response, in the procedures of interviewing, the
hiring and training of the interviewers, failure to cover the sample areas
completely or to go beyond bounds, plus other sources of error. Measurement
of the non-sampling errors is important and requires supplementary
samples especially designed for the purpose. The presentation of the results
of a survey should contain measures both of the precision (standard
errors) of the figures of chief interest and of the non-sampling errors as
well if any experiments have been included to measure these errors. In
regard to the non-sampling errors, the presentation of the results should at
least include a copy of the questionnaire, information regarding the
amount of non-response, and any special difficulties encountered in procuring
the information or following the instructions. Such information is
essential in the proper use of the results.

11. Two-Stage Sampling. The cost of a sample census taken in one
stage may be represented approximately by the equation

$$k_s = k_1 N + k_2 n + tn, \qquad (27)$$

where $k_s$ is the total cost, $k_1$ the average cost of listing a dwelling unit, $k_2$
the average cost of interviewing the people in a dwelling unit, and t is the
average cost of processing and tabulating the schedule or schedules from
a dwelling unit. The cost of a complete count would be

$$k_c = k_3 N + tN, \qquad (28)$$

where $k_3$ is the cost of interviewing the people in a dwelling unit when the
coverage is complete. Obviously the relation between $k_2$ and $k_3$ will depend
on how thin the sample is: if the sample is 100%, $k_2$ and $k_3$ are equal; if
the sample is thin, such as 1:20, $k_2$ will perhaps treble because of the
increased amount of travel between interviews. It might be, for example,
that a rough relation is


‘-‘•I'+tk! 


( 29 ) 



which will serve for speculation. The ratio N/n will be computed from
(12) or (17), whereupon the last equation or any other approximate
relation between $k_2$ and $k_3$ will show that the costs $k_s$ and $k_c$ for sample and
complete count become equal for small cities. Naturally the complete
count would be preferred in cities of this size and below, as the complete
count does not require a sample design and it gives full details of the
population by small areas within the city.

On the other hand, as the size of the city increases, the cost $k_1 N$ for
listing alone runs far ahead of the actual cost $(k_2 + t)n$ of interviewing
and tabulating the sample; hence for very large cities, most of the total
cost would be spent for listing. This is uneconomical, and a better way can
be found.

One solution lies in 2-stage sampling. The city is first divided into M
primary units or districts. A sample of m of these districts is drawn at
random, and dwelling units are listed in these m districts only. These
dwelling units form secondary units within the primary units. A sample of
dwelling units is drawn from the lists, and these dwelling units furnish
the information. The advantage of such a plan lies in the fact that no
listing need be performed in the M − m districts not drawn into the
sample.

In the sampling of a city, a block or a combination of blocks forms an
excellent primary unit. The secondary units may be single dwelling units
as used in the above illustration, or they may be small areas or “clusters”
of 2, 3, or 6 consecutive dwelling units drawn from a map. In the sampling
of a region or of the whole country, the county or a combination of counties
forms an excellent primary unit, within which there will be secondary
units, and usually tertiary and still smaller units.

In 2-stage sampling, the estimate of A, the total population having a
specified characteristic, will be

$$X = \frac{M}{m}\sum_{i=1}^{m}\frac{N_i}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad (30)$$

where

M is the total number of primary units (to be spoken of as districts)
m is the number of these districts drawn into the sample
$N_i$ is the number of dwelling units listed in the ith district (i = 1, 2, ..., m)
$n_i$ is the number of dwelling units drawn into the sample from this district
$x_{ij}$ is the population in the jth dwelling unit of this district (j = 1, 2, ..., $n_i$)



If all M districts are drawn with equal probabilities, and if all $N_i$
dwelling units in the ith district are drawn with equal probabilities, then
it follows that

$$EX = A. \qquad (31)$$

That is, X is an unbiased estimate of A. In practice, $N_i/n_i$ is often the
same for all districts for ease in tabulation, in which case

$$X = \frac{M}{m}\,\frac{N}{n} \times (\text{the total population of the sample}) \qquad (32)$$

wherein N/n is written for the constant value of $N_i/n_i$.
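The 2-stage estimate can be sketched in a few lines: each sampled district's returns are inflated by its own ratio $N_i/n_i$, and the sum over the m districts is inflated by M/m. The function name and the figures are assumptions made for the illustration.

```python
def two_stage_estimate(M, districts):
    """Estimate of the total population A from a 2-stage sample.

    M         : total number of districts in the city
    districts : one (N_i, returns) pair for each of the m sampled districts,
                where returns holds the populations x_ij of the n_i dwelling
                units interviewed in that district
    """
    m = len(districts)
    return M / m * sum(N_i / len(xs) * sum(xs) for N_i, xs in districts)

# Hypothetical sample: 2 of 10 districts, with a constant interval N_i/n_i = 5
districts = [(20, [3, 4, 2, 3]), (40, [2, 2, 5, 1, 4, 4, 3, 3])]
print(two_stage_estimate(10, districts))
```

With a constant interval, as here, the result is simply (M/m)(N/n) times the total population of the sample, i.e., 5 × 5 × 36, as (32) states.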
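The variance of the estimate X will be*

$$\sigma_X^2 = M^2\,\frac{M-m}{M-1}\,\frac{\sigma_e^2}{m} + \frac{M}{m}\sum_{i=1}^{M} N_i^2\,\frac{N_i-n_i}{N_i-1}\,\frac{\sigma_i^2}{n_i}, \qquad (33)$$

where

$\sigma_e^2$ is the variance between the M districts
$\sigma_i^2$ is the variance between dwelling units within the ith district.

Mathematically defined,

$$\sigma_e^2 = \frac{1}{M}\sum_{i=1}^{M} (A_i - \bar{A})^2 \qquad (34)$$

$$\sigma_i^2 = \frac{1}{N_i}\sum_{j=1}^{N_i} (a_{ij} - \bar{a}_i)^2 \qquad (35)$$

where

$$A_i = \text{the population of the } i\text{th district} = \sum_{j=1}^{N_i} a_{ij} \qquad (36)$$

$a_{ij}$ = the population of the jth dwelling unit in the ith district
(j = 1, 2, ..., $N_i$; i = 1, 2, ..., M)

and $\bar{A} = A/M$ and $\bar{a}_i = A_i/N_i$ are the corresponding means.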

The first term of (33) arises from the fact that the districts will in
practice not all have the same populations. The second term arises from
the fact that the dwelling units within any district will likewise not all
contain the same populations. The statistician will try to adjust the
boundaries of the districts so that they have roughly the same number of
people or dwelling units so as to minimize the variances for the chief
characteristics that are to be studied. In particular, large units must be
broken up or set aside for separate sampling.

* A derivation of this equation is given in “A Chapter in Population Sampling”
(Bureau of the Census, 1947), and in Deming’s “Some Theory of Sampling,” Chapter 6.

By raising or lowering m and $n_i$ the variance of X can be governed.
There are many combinations of m and $n_i$ which will produce a desired
variance in X. Different combinations will incur different costs, however,
and one of the chief problems of sample design is to find what combination
of m and $n_i$ will be most economical.

As an illustration, we may think of a simplified situation in which the
districts all contain the same number N of dwelling units, and in which
the $\sigma_i$ are all closely equal to $\sigma_w$. By setting

$$\sigma_e = N\sigma_b \qquad (38)$$

(33) reduces to

$$C_X^2 = \frac{1}{\mu^2}\left[\frac{M-m}{M-1}\,\frac{\sigma_b^2}{m} + \frac{N-n}{N-1}\,\frac{\sigma_w^2}{mn}\right], \qquad (39)$$


in which n represents the number of dwelling units to be drawn from each
district, and $C_X$ is the coefficient of variation of the estimate X. If the
districts are small and already listed, and not too far apart, little overhead
cost is incurred in bringing a new district into the sample, and the total
cost will be closely expressible as

$$k = k_2 mn, \qquad (40)$$

wherein $k_2$ is the average cost of conducting an interview in one dwelling
unit. Now obviously, as mn occurs in the denominator of the last term of
(39), this term is constant regardless of how m and n are adjusted, provided
that the total number of interviews (mn) and hence the total cost
(k) are held constant. The first term, however, decreases as m is increased.
It follows, then, that under the assumptions made here the best
allocation of effort is to take one interview per district, and to draw as
many districts into the sample as funds will permit.

If the districts are so large that each one when brought into the sample
entails an appreciable overhead cost of listing and supervision, then the
cost-function will be more like

$$k = k_1 m + k_2 mn. \qquad (41)$$

It can be shown that the variance of X is a minimum for a given allowable
cost k when*

This is the sampling interval to apply when drawing the sample of
dwelling units for each district brought into the sample. It is to be noted
that m and $\sigma_X$ do not occur in this equation; hence n is independent of
m and $\sigma_X$. Moreover, the variances and costs are not involved separately,
but only in ratios. In practice, oh'-ffw is rarely as low as 1, but is often as 
high as 2, and may be 3 or 4 or higher. 

If the total cost k is fixed, the number of districts to be brought into 
the sample will be found by solving (41) for m. 

If, on the other hand, the variance of X is prescribed, say at 2% or 
some other level, m is to be found from (39), and the total cost k may then 
be predicted by (41). 
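The balancing of variance against cost may be illustrated numerically. The sketch below assumes the simplified form of (39) with both finite multipliers taken as unity, together with the cost-function (41); under those assumptions the well-known optimum is n = (σ_w/σ_b)·√(k₁/k₂), which is offered here as the standard textbook result and not as a transcription of Eq. 42. All names and figures are hypothetical.

```python
import math

def cv2(m, n, sigma_b, sigma_w, mu=1.0):
    """Simplified (39): squared coefficient of variation of X,
    with the finite multipliers taken as unity (an assumption)."""
    return (sigma_b**2 / m + sigma_w**2 / (m * n)) / mu**2

def optimum_n(sigma_b, sigma_w, k1, k2):
    """Dwelling units per district minimizing the variance for a fixed cost
    under cost-function (41); standard result for the simplified model."""
    return sigma_w / sigma_b * math.sqrt(k1 / k2)

def districts_for_budget(k, n, k1, k2):
    """Solve (41), k = k1*m + k2*m*n, for the number of districts m."""
    return k / (k1 + k2 * n)

# Hypothetical inputs: sigma_b = 1, sigma_w = 2, k1 = 9, k2 = 1, budget k = 1500
n_star = optimum_n(1.0, 2.0, 9, 1)
m_star = districts_for_budget(1500, n_star, 9, 1)

# Brute-force check over whole numbers of dwelling units per district
best_n = min(range(1, 31),
             key=lambda n: cv2(districts_for_budget(1500, n, 9, 1), n, 1.0, 2.0))
print(n_star, m_star, best_n)
```

When the overhead cost $k_1$ is negligible, the formula drives n toward its smallest value, which agrees with the earlier conclusion that one interview per district is then best.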

By these methods, even though the assumptions underlying this development
are idealized, estimates of $\sigma_X$ and of the total cost k
may be satisfactorily adjusted in advance to the requirements and the
budget. Without such mathematical guidance, it is easy to spend several
times as much money as necessary for the information obtained.

The preceding equations are applicable in a wide variety of problems.
Aside from the sampling of human populations they are used in the
sampling of farms, business establishments, and industrial products.

The theory for the precision of a sample taken in two stages requires
a lengthy discussion of the expected values of the variances between districts
and between dwelling units, and cannot be included here.† The actual
procedure should, however, revert to a single stage, for ease in appraising
the precision.

Cost-functions other than those used in (27), (28), (29), (40), and (41)
will occasionally be found useful for special circumstances.‡

Extension of sampling designs to three or more stages involves no new 
difficulty, but will not be included here. 

* Eq. 42 was derived by Shewhart and by Tippett, both in 1931. Its importance,
however, only became obvious after 1942 through the work of Morris H. Hansen and
William N. Hurwitz in the Bureau of the Census.

† Much more general discussions will be found in “A Chapter in Population Sampling,”
and in Chapter 7 of Deming’s book already cited.

‡ See P. C. Mahalanobis, “On Large-Scale Sample-Surveys,” Phil. Trans. Royal
Soc., CCXXXI B (1943), 329. Also, Hansen, Hurwitz, and Madow’s “Sample Survey
Methods and Theory.”

12. Calibration Samples. A very important use of sampling is to
calibrate a previous set of measurements. Suppose that the M districts had
populations $B_i$ (i = 1, 2, ..., M) at an earlier date, and now have unknown
populations $A_i$. Suppose, too, that $A_i$ and $B_i$ are highly correlated;
then it will suffice to measure a sample of districts, and from this sample
to estimate satisfactorily what results would be obtained if all the M districts
were measured afresh. The true calibration factor connecting the
two sets of measurements would be

$$\phi = \frac{\sum_{i=1}^{M} A_i}{\sum_{i=1}^{M} B_i} \qquad (43)$$

but it is of course unknown.

Let a sample of m districts be drawn and let the populations $A_i$ and $B_i$
therefor be listed as

$X_1, X_2, \ldots, X_m$
$Y_1, Y_2, \ldots, Y_m$.

Then from this sample one may form an estimate f of $\phi$ by writing

$$f = \frac{\sum_{i=1}^{m} X_i}{\sum_{i=1}^{m} Y_i}. \qquad (44)$$

The variance of this estimate may be written* 


where 


K* 

^ M—\ m' 

(45) 

~M^\ A ) 

(46) 

II 

(47) 


In practice V must be estimated from the sample of m districts; an approximate
but usually very satisfactory formula may then be obtained by
writing

M-m 1 

M X )' 




(48) 


* A proof is on p. 172 of Deming’s book. 




Once f is computed from (44), the total population of all M districts
today is estimated as

$$X = Bf, \qquad (49)$$

where

$$B = \sum_{i=1}^{M} B_i \qquad (50)$$

or the total population as it was when the previous measurements were
made. As B is a constant, and not a random variable of this sample, it
follows that

$$C_X = C_f. \qquad (51)$$

Any error in B constitutes a bias and not a sampling error.

A small sample of perhaps 20 or 50 districts will often give high precision
in the estimates X or f.

This equation is much used in population sampling wherein $Y_i$ may
represent a population as determined by a census-taker some time ago,
and $X_i$ is a redetermination made by a sample. If the old and new measurements
are highly correlated, V is small and $C_f$ may be small even
though m is small. It is noteworthy that the old and new measurements
may be stated in different units.

Another use of a calibration is found when a cheap and perhaps biased
method is used on all M districts, and a sample of m of them is re-measured
by a more elaborate method. The equations apply in fact wherever a district
drawn at random possesses two numbers, $X_i$ and $Y_i$.
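The calibration may be sketched as follows: the factor f of (44) is the ratio of the new total to the old total over the sampled districts, and (49) applies it to the known earlier total B. The function name and the figures are assumptions for illustration.

```python
def calibration_estimate(sample_pairs, B_total):
    """Return the factor f of (44) and the estimated present total X = B f of (49).

    sample_pairs : (X_i, Y_i) for the m sampled districts, X_i being the new
                   measurement and Y_i the earlier one
    B_total      : total of the earlier measurements over all M districts
    """
    f = sum(x for x, _ in sample_pairs) / sum(y for _, y in sample_pairs)
    return f, B_total * f

# Hypothetical sample of 3 districts; earlier total for the whole city 10,000
pairs = [(110, 100), (220, 200), (165, 150)]
f, X = calibration_estimate(pairs, 10_000)
print(f, X)
```

Since f is a ratio of two totals, the old and new measurements need not be in the same units, as the text observes.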

13. Stratified Sampling. Often, for the same cost, much greater
precision (smaller variance) in X or $\bar{x}$ can be obtained by dividing the
universe into strata so that the sampling units are as nearly as possible
alike within strata, and as different as possible between strata. Each
stratum may be thought of as a separate bowl. The gain in precision is
often disappointing; on the other hand, it is often striking. Stratification
should be used only when it brings more information per dollar. Some
simple theory will provide a basis for the best procedure. Each survey
requires separate consideration. Given a certain allowable total cost, the
total sample n may be allocated in various ways amongst the several
strata. The problem to be solved is to find the procedure which delivers
most information (smallest $\sigma_X$) per unit cost.

An outline of the theory of stratified sampling will now be given.

(a) Let there be M classes, which will be designated by subscripts. The
estimated total population will be

$$X = \sum_{i=1}^{M} N_i \bar{x}_i \qquad (52)$$



and from (16) it may be seen by summation that

$$\sigma_X^2 = \sum_{i=1}^{M} \frac{N_i - n_i}{N_i - 1}\,\frac{N_i^2 \sigma_i^2}{n_i}. \qquad (53)$$

The total cost of the survey will be

$$k = \sum_{i=1}^{M} k_i n_i, \qquad (54)$$

where $k_i$ is the cost of an interview or a questionnaire in Class i. It can be
shown that $\sigma_X$ is made a minimum when the sampling interval in the ith
class is

$$\frac{N_i}{n_i} = G\,\frac{\sqrt{k_i}}{\sigma_i}, \qquad (55)$$

where G is a proportionality constant. This equation says that $n_i$ should
not only be proportional to $N_i$ (the number of units in Class i) but also
proportional to $\sigma_i$ and inversely proportional to the square root of the
unit cost, $k_i$. Strictly, (55) should contain the factor $\sqrt{N_i/(N_i - 1)}$,
which is here assumed to be unity. It is said to give optimum allocation.
In practice it must not be assumed that this allocation is always
preferable. It will always give smaller variance in X than any other
allocation; but the questions to be considered are how much smaller it will
be, and whether the gain is worth the extra cost of separate allocations and
tabulations by strata. A further disadvantage of optimum allocation is
that each characteristic to be estimated has its own $\sigma_i$ and its own
specific value of $N_i/n_i$. Any particular value of $N_i/n_i$ may be good for
one characteristic, but not for the others.

(b) For such reasons, and because it yields most of the possible benefits
of stratification, the most widely used allocation is proportionate allocation
by which

$$\frac{N_i}{n_i} = \frac{N}{n}$$

is a constant for all classes.

One may always estimate, in advance, the difference in precision to be
expected from the three different systems: no stratification, proportionate
allocation, optimum allocation. See, for example, Chapter 6 in Deming’s
“Some Theory of Sampling.”

(c) Still another possible procedure is to use no stratification at all, in
which case (16) applies, or (25) or (26) if systematic sampling is used.
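The comparison of proportionate with optimum allocation can be carried out in advance with a few lines of arithmetic. The sketch below evaluates (53) for both allocations at the same total cost (54). The two strata, their standard deviations, and the unit costs are invented, and the allocations are left as fractions, to be rounded in practice.

```python
def stratified_variance(N, sigma, n):
    """Equation (53): variance of the estimated total X."""
    return sum((Ni - ni) / (Ni - 1) * Ni**2 * si**2 / ni
               for Ni, si, ni in zip(N, sigma, n))

def proportionate_allocation(N, n_total):
    """n_i proportional to N_i, so that N_i / n_i is constant."""
    total = sum(N)
    return [n_total * Ni / total for Ni in N]

def optimum_allocation(N, sigma, cost, budget):
    """n_i proportional to N_i * sigma_i / sqrt(k_i), as (55) prescribes,
    scaled so that the total cost (54) equals the budget."""
    weights = [Ni * si / ki**0.5 for Ni, si, ki in zip(N, sigma, cost)]
    scale = budget / sum(w * ki for w, ki in zip(weights, cost))
    return [scale * w for w in weights]

# Hypothetical frame: two classes, unit cost 1 in each, budget for 500 interviews
N, sigma, cost = [8000, 2000], [1.0, 4.0], [1.0, 1.0]
n_prop = proportionate_allocation(N, 500)
n_opt = optimum_allocation(N, sigma, cost, 500)
var_prop = stratified_variance(N, sigma, n_prop)
var_opt = stratified_variance(N, sigma, n_opt)
print(n_prop, var_prop)
print(n_opt, var_opt)
```

In this invented case the small but highly variable class receives half the interviews under optimum allocation, and the variance of X falls well below that of proportionate allocation; with more nearly equal $\sigma_i$ the gain would be slight.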

In any practical problem it is necessary to compare all three allocations
by computing the expected variances in X, and the costs. Some previous
experience is necessary in order to provide data for estimating the $\sigma_i$
which enter into the calculations, and for estimating the expected costs;
but with careful effort, excellent efficiency can be built into the planning
of a survey. Equations showing the advantages of one type of allocation
over another are to be found in advanced treatises in sampling, to which
references have been given. In particular, students who wish to proceed
beyond the elements of the theory presented in this Appendix should
consult the U.S. Census Bureau’s “A Chapter in Population Sampling,” F.
Yates’s “Sampling Methods for Censuses and Surveys,” W. E. Deming’s
“Some Theory of Sampling,” Hansen, Hurwitz, and Madow’s “Sample
Survey Methods and Theory,” and W. G. Cochran’s “Sampling Techniques.”

